Idiap Research Institute
Sparse Pronunciation Codes for Perceptual Phonetic Information Assessment
Type of publication: Conference paper
Citation: Asaei_SPARS_2017
Publication status: Accepted
Booktitle: Workshop on Signal Processing with Adaptive Sparse Structured Representations (SPARS)
Year: 2017
Note: Proceeding of Abstracts for Communication
Abstract: Speech is a complex signal produced by a highly constrained articulation machinery. Neuro- and psycholinguistic theories assert that speech can be decomposed into molecules of structured atoms. Although the characterization of the atoms is controversial, experiments support the notion of invariant speech codes governing speech production and perception. We exploit deep neural network (DNN) invariant representation learning for probabilistic characterization of the phone attributes, which are defined in terms of the phonological classes and known as the smallest-size perceptual categories. We cast speech perception as a channel for phoneme information transmission via the phone attributes. Structured sparse codes are identified from the phonological probabilities for natural speech pronunciation. We exploit the sparse codes in information transmission analysis for assessment of phoneme pronunciation. Linguists define a single binary phonological code per phoneme. In contrast, probabilistic estimation of the phonological classes enables us to capture the large variation in the structures of speech pronunciation. Hence, speech assessment need not be confined to the single expert-knowledge-based mapping between phonemes and phonological classes; it may be extended to the multiple data-driven mappings observed in natural speech.
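The idea of several data-driven binary codes per phoneme, and of measuring how much phoneme information those codes carry, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy posteriors, the three unnamed phonological classes, the 0.5 threshold used as a stand-in for the structured sparse coding step, and the use of empirical mutual information as the transmission measure.

```python
from collections import Counter
from math import log2


def binarize(posteriors, threshold=0.5):
    """Round phonological class probabilities to a binary code (assumed rule)."""
    return tuple(int(p >= threshold) for p in posteriors)


def mutual_information(pairs):
    """Empirical I(X;Y) in bits from a list of (x, y) observations."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


# Hypothetical toy frames: (phoneme label, posteriors over three classes).
frames = [
    ("p", [0.9, 0.1, 0.2]),
    ("p", [0.8, 0.6, 0.1]),
    ("b", [0.1, 0.9, 0.3]),
    ("b", [0.2, 0.8, 0.1]),
]

# One binary code per frame, paired with its phoneme label.
observations = [(ph, binarize(post)) for ph, post in frames]

# Collect the distinct codes observed for each phoneme.
codes_per_phoneme = {}
for ph, code in observations:
    codes_per_phoneme.setdefault(ph, set()).add(code)

# Phoneme information carried by the codes, in bits.
mi_bits = mutual_information(observations)
```

In this toy data the phoneme "p" is realised by two distinct codes while "b" has one, yet the codes still identify the phoneme without ambiguity, so the estimated mutual information equals the phoneme entropy.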
Keywords:
Projects: Idiap, PHASER-QUAD
Authors: Asaei, Afsaneh; Cernak, Milos; Bourlard, Hervé; Ram, Dhananjay
Attachments
  • Asaei_SPARS_2017.pdf