An Empirical Model of Emphatic Word Detection
Type of publication: Conference paper
Citation: Cernak_INTERSPEECH_2015
Publication status: Published
Booktitle: Proc. of Interspeech
Year: 2015
Month: September
Pages: 573-577
Publisher: ISCA
Location: Dresden, Germany
Abstract: The paper presents an empirical model of emphatic word detection as an alternative to conventional machine-learning-based methods. The model is based on Probabilistic Amplitude Demodulation (PAD), applied iteratively to obtain syllable and stress modulations, i.e., the cascaded PAD method. Emphatic words are detected from prominent peaks of the stress modulation, considering those peaks that are stressed or accented. Cascaded demodulation steered with general-purpose values derived from an average syllable duration of 200 ms yields a detection accuracy of 81%-83%. Speaker-dependent cascaded demodulation, which takes each speaker's specific speaking rate into account, yields a detection accuracy of 86%-91%. The advantages of the proposed empirical detection model are (i) noise robustness, (ii) language independence, and (iii) no need for a training phase.
Keywords: Speech Analysis
Projects: Idiap
Authors: Cernak, Milos; Honnet, Pierre-Edouard
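
Illustrative sketch (not from the paper): the cascaded demodulation and peak picking described in the abstract can be approximated roughly as follows. This is a minimal sketch under stated assumptions. It replaces PAD with a simple rectify-and-smooth envelope, which is not the probabilistic method the authors use. All function names, the `stress_factor` parameter, and the relative peak threshold are hypothetical. Only the 200 ms average syllable duration comes from the abstract.

```python
import numpy as np

def moving_average(x, win):
    """Low-pass a signal with a centered moving average of `win` samples."""
    win = max(1, int(round(win)))
    return np.convolve(x, np.ones(win) / win, mode="same")

def cascaded_envelopes(x, fs, syllable_s=0.2, stress_factor=2.0):
    """Two-stage envelope extraction (stand-in for cascaded PAD, NOT PAD itself).

    Stage 1 rectifies the waveform and smooths it over half the assumed
    syllable duration to get a syllable-rate modulation. Stage 2 smooths that
    modulation over `stress_factor` syllables (an assumed value) to get a
    slower, stress-rate modulation.
    """
    syllable_mod = moving_average(np.abs(x), 0.5 * syllable_s * fs)
    stress_mod = moving_average(syllable_mod, stress_factor * syllable_s * fs)
    return syllable_mod, stress_mod

def prominent_peaks(y, fs, rel_height=0.5):
    """Times (in seconds) of local maxima of y that reach rel_height * max(y)."""
    inner = y[1:-1]
    mask = (inner > y[:-2]) & (inner >= y[2:]) & (inner >= rel_height * y.max())
    return (np.nonzero(mask)[0] + 1) / fs

# Synthetic test signal: a 200 Hz carrier with 5 Hz "syllable" bursts and
# one emphasized region centered at t = 2.0 s.
fs = 2000
t = np.arange(4 * fs) / fs
syllables = 0.5 * (1.0 + np.cos(2 * np.pi * 5 * t))
emphasis = 1.0 + 2.0 * np.exp(-((t - 2.0) / 0.2) ** 2)
signal = syllables * emphasis * np.sin(2 * np.pi * 200 * t)

_, stress_mod = cascaded_envelopes(signal, fs)
peak_times = prominent_peaks(stress_mod, fs, rel_height=0.7)
print(peak_times)
```

In this sketch, a speaker-dependent variant would amount to replacing `syllable_s=0.2` with an estimate of the speaker's own average syllable duration. Mapping the detected peak times onto word boundaries is left out.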
