Idiap Research Institute
Analysis of MLP Based Hierarchical Phoneme Posterior Probability Estimator
Type of publication: Journal paper
Citation: Pinto_IEEE_TASLP_2010
Publication status: Published
Journal: IEEE Transactions on Audio, Speech, and Language Processing
Volume: 19
Number: 2
Year: 2011
Pages: 225-241
Abstract: We analyze a simple hierarchical architecture consisting of two multilayer perceptron (MLP) classifiers in tandem to estimate the phonetic class conditional probabilities. In this hierarchical setup, the first MLP classifier is trained using standard acoustic features. The second MLP is trained using the posterior probabilities of phonemes estimated by the first, but with a long temporal context of around 150-230 ms. Through extensive phoneme recognition experiments, and the analysis of the trained second MLP using Volterra series, we show that (a) the hierarchical system yields higher phoneme recognition accuracies (absolute improvements of 3.5% and 9.3% on TIMIT and CTS, respectively) than the conventional single-MLP-based system, (b) there exists useful information in the temporal trajectories of the posterior feature space, spanning around 230 ms of context, (c) the second MLP learns the phonetic temporal patterns in the posterior features, which include the phonetic confusions at the output of the first MLP as well as the phonotactics of the language as observed in the training data, and (d) the second MLP classifier requires fewer parameters and can be trained using less training data.
Keywords:
Projects: Idiap, SNSF-KEYSPOT, IM2
Authors: Pinto, Joel Praveen; Sivaram, G. S. V. S.; Magimai.-Doss, Mathew; Hermansky, Hynek; Bourlard, Hervé
Attachments
  • Pinto_IEEE_TASLP_2010.pdf
Notes
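The abstract describes feeding the second MLP with the first MLP's phoneme posteriors taken over a long temporal context. The following is a minimal sketch of how such an input could be assembled, not the authors' implementation. The function name `stack_context`, the edge-padding choice, the 10 ms frame shift (so that 23 frames span roughly 230 ms), and the 40 phoneme classes are all assumptions made for illustration.

```python
import numpy as np


def stack_context(posteriors, context_frames):
    """Concatenate each frame with its neighbors in a symmetric window.

    posteriors: array of shape (num_frames, num_classes), e.g. the
        per-frame phoneme posteriors produced by a first classifier.
    context_frames: odd window length in frames (hypothetical parameter).
    Returns an array of shape (num_frames, num_classes * context_frames).
    """
    if context_frames % 2 == 0:
        raise ValueError("context_frames must be odd")
    half = context_frames // 2
    # Assumption: repeat the first and last frames to pad utterance edges.
    padded = np.pad(posteriors, ((half, half), (0, 0)), mode="edge")
    num_frames = len(posteriors)
    # Shifted views of the padded sequence, one per window position.
    windows = [padded[i:i + num_frames] for i in range(context_frames)]
    return np.concatenate(windows, axis=1)


# Illustrative usage with random stand-in posteriors (not real data).
rng = np.random.default_rng(0)
first_mlp_out = rng.dirichlet(np.ones(40), size=100)  # 100 frames, 40 classes
second_mlp_in = stack_context(first_mlp_out, 23)      # ~230 ms at an assumed 10 ms shift
```

Each row of `second_mlp_in` would then serve as one input vector to the second classifier, with the center block of the row holding the posteriors of the frame being classified.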