Keywords:
- acoustic modeling
- Adaboost
- Alzheimer's disease
- Anti-spoofing
- articulatory features
- Artificial Neural Networks
- atypical speech
- Automatic accent assessment
- Automatic accent evaluation
- automatic gender recognition
- Automatic speaker verification (ASV)
- Automatic Speech Recognition
- automatic subword unit derivation
- bag of audio words
- Binary features
- binary masking
- BoAW
- boosting
- breathing pattern estimation
- breathing patterns
- Children speech recognition
- Classification
- CNN visualization
- ComParE features
- computational efficiency
- Conditional Random Fields
- confidence measures
- continuous speech recognition
- boosted binary features
- resource management
- Convolution Neural Network
- Convolutional neural network
- Convolutional Neural Networks
- COVID-19 identification
- cross-database
- deep learning
- deep neural networks
- depression detection
- Direction of arrival estimation
- dynamic programming
- Dysarthria
- Dysarthric speech
- embedding
- Emotion Recognition
- end-to-end acoustic modeling
- End-to-end learning
- end-to-end modelling
- end-to-end training
- expected performance and spoofability curve
- Expressive Vocalizations
- feature selection
- Few-shot learning
- fine-tuning
- fixed-size word patterns
- Formant identification
- Formants
- Fundamental frequency
- Fusion
- Gaussian mixture
- glottal source signals
- grapheme
- Grapheme subword units
- grapheme subwords
- grapheme-to-phoneme conversion
- grapheme-to-phoneme converter
- Graphemes
- Hidden Markov Model
- hidden Markov models
- human skeleton estimation
- integration of ASV and anti-spoofing
- Interpretable Models
- isolated word recognition
- Kalman filters
- KL-divergence
- KL-HMM
- Kullback-Leibler divergence
- Kullback-Leibler divergence based hidden Markov model
- Kullback-Leibler divergence based HMM
- Kullback–Leibler divergence based hidden Markov model
- language disorder
- Language Production
- letter-to-sound rules
- lexical model
- Lexical modeling
- Lexicon
- local posterior probability
- localization
- long-term statistics
- low level descriptors
- Mental Lexicon
- microphone array
- microphone arrays
- mobile biometrics
- modalities fusion
- modified ZFF
- multi-layer perceptron
- Multi-modal Approach
- multi-stream combination
- Multi-task learning
- multilayer perceptron
- multilayer perceptron network
- multilingual acoustic modeling
- multiple linear regression
- Multiple speaker localization
- multiple speakers
- multiple-stream combination
- multitask learning
- neural network
- Noise Robustness
- non-native speech
- non-native speech recognition
- Objective Evaluation
- Objective intelligibility
- Objective intelligibility Assessment
- objective measures
- overlapping speech recognition
- Paralinguistic speech processing
- Parkinson's disease
- parts-based approach
- Pathological speech
- Pathological Speech Processing
- Perceived fluency
- phoneme
- phoneme modeling
- Phoneme recognition
- phoneme subword units
- phoneme subwords
- phonemes
- Phonetic information
- phonetic representation
- Phonocardiogram
- Posterior features
- posterior probabilities
- pre-trained embedding
- presentation attack
- Presentation Attack Detection
- probabilistic lexical modeling
- pronunciation generation
- pronunciation lexicon
- Raw Speech
- raw waveform modelling
- raw waveforms
- raw-waveform cnn
- Reading Assessment
- recognition
- recurrent neural network
- Respiratory parameters
- S1-S2 detection
- Scottish Gaelic
- segment-level training
- Self-Organizing Maps
- Self-supervised embedding
- self-supervised learning
- sign language assessment
- Sign language processing
- signal processing
- sleepiness
- speaker verification
- speaker-specific features
- spectral statistics
- Speech Analysis
- speech assessment
- Speech breathing
- Speech Emotion Recognition
- Speech enhancement
- Speech intelligibility
- speech recognition
- speech separation
- speech synthesis
- Speech technology
- Spoofing
- spoofing detection
- Steered response power
- String matching
- SVM
- syllable-level-features
- synthetic reference templates
- Synthetic speech
- TANDEM features
- template-based approach
- template-based system
- Text classification
- text-to-speech synthesis
- tracking
- under-resource speech recognition
- under-resourced languages
- universal phoneme set
- unsupervised adaptation
- utterance verification
- voice activity detection
- Voice Conversion
- zero frequency filter
- Zero frequency filtering
- zero-frequency filtering
- zero-resourced speech recognition
Publications of Mathew Magimai.-Doss sorted by title
O
On Modeling Context-dependent Clustered States: Comparing HMM/GMM, Hybrid HMM/ANN and KL-HMM Approaches, Idiap-RR-43-2013
On Modeling Context-Dependent Clustered States: Comparing HMM/GMM, Hybrid HMM/ANN and KL-HMM Approaches, in: International Conference on Acoustics, Speech, and Signal Processing, Florence, IT, pages 7659–7663, IEEE, 2014
On Modeling Glottal Source Information for Phonation Assessment in Parkinson’s Disease, in: Proceedings of Interspeech, 2021
On Recognition of Non-Native Speech Using Probabilistic Lexical Model, in: Proceedings of the 15th Annual Conference of the International Speech Communication Association (Interspeech 2014), 2014
On the Adequacy of Baseform Pronunciations and Pronunciation Variants, Idiap-RR-27-2004
On the Application of Automatic Subword Unit Derivation and Pronunciation Generation for Under-Resourced Language ASR: A Study on Scottish Gaelic, Idiap-RR-13-2015
On The Relationship Between Speech-based Breathing Signal Prediction Evaluation Measures And Breathing Parameters Estimation, in: Proc. of ICASSP, 2021
P
Phase AutoCorrelation (PAC) features for noise robust speech recognition, in: Speech Communication, 54(7):867–880, 2012
Phoneme based Respiratory Analysis of Read Speech, in: Proceedings of European Signal Processing Conference (EUSIPCO), 2021
Phoneme Recognition using Boosted Binary Features, in: IEEE Intl. Conference on Acoustics, Speech and Signal Processing 2011, 2011
Phoneme vs Grapheme Based Automatic Speech Recognition, Idiap-RR-48-2004
Phoneme-Grapheme Based Speech Recognition System, in: Proceedings of IEEE ASRU, 2003
Phoneme-Grapheme Based Speech Recognition System, Idiap-RR-37-2003
Posterior Features Applied to Speech Recognition Tasks with Limited Training Data, Idiap-RR-15-2008
Posterior features applied to speech recognition tasks with user-defined vocabulary, in: Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2009
Posterior Features for Template-based ASR, in: Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, Prague, Czech Republic, 2011
Posterior-Based Multi-Stream Formulation To Combine Multiple Grapheme-to-Phoneme Conversion Techniques, Idiap-RR-33-2015
Presentation Attack Detection Using Long-Term Spectral Statistics for Trustworthy Speaker Verification, in: International Conference of the Biometrics Special Interest Group (BIOSIG), 2016
Privacy-Sensitive Audio Features for Speech/Nonspeech Detection, Idiap-RR-12-2011
Privacy-Sensitive Audio Features for Speech/Nonspeech Detection, in: IEEE Transactions on Audio, Speech, and Language Processing, 19(8), 2011
Probabilistic Lexical Modeling and Grapheme-based Automatic Speech Recognition, Idiap-RR-15-2013
Probabilistic Lexical Modeling and Unsupervised Training for Zero-Resourced ASR, in: Proceedings of the IEEE workshop on Automatic Speech Recognition and Understanding, 2013
Probabilistic Symbol Sequence Matching and its Application to Pathological Speech Intelligibility Assessment, Idiap-RR-01-2021
Pronunciation Lexicon Development for Under-Resourced Languages Using Automatically Derived Subword Units: A Case Study on Scottish Gaelic, in: 4th Biennial Workshop on Less-Resourced Languages, 2015
Pronunciation models and their evaluation using confidence measures, Idiap-RR-29-2001
R
Raw Speech Signal-based Continuous Speech Recognition using Convolutional Neural Networks, Idiap-RR-15-2014
Robust overlapping speech recognition based on neural networks, Idiap-RR-55-2007
S
Segment-level training of ANNs based on acoustic confidence measures for hybrid HMM/ANN Speech Recognition, in: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019
Signal-to-signal neural networks for improved spike estimation from calcium imaging data, in: PLoS Computational Biology, 17(3):1–19, 2021
SMILE Swiss German Sign Language Dataset, in: Language Resources and Evaluation Conference, 2018
Speaker Change Detection with Privacy-Preserving Audio Cues, Idiap-RR-23-2009
Speaker Change Detection with Privacy-Preserving Audio Cues, in: Proceedings of ICMI-MLMI 2009, 2009
Spectro-Temporal Activity Pattern (STAP) Features for Noise Robust ASR, in: Proceedings of the INTERSPEECH-ICSLP-04, 2004
Spectro-Temporal Activity Pattern (STAP) Features for Noise Robust ASR, Idiap-RR-20-2004
Speech Processing, in: Interactive Multimodal Information Management, pages 221–245, EPFL Press, 2013
Speech recognition of spontaneous, noisy speech using auxiliary information in Bayesian networks, in: Proceedings of the 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP-03), 2003
Speech recognition of spontaneous, noisy speech using auxiliary information in Bayesian networks, Idiap-RR-44-2002
Speech recognition with auxiliary information, in: IEEE Trans. on Speech and Audio Processing, 4, 2004
Speech recognition with auxiliary information, Idiap-RR-58-2002
Subunits Inference and Lexicon Development Based on Pairwise Comparison of Utterances and Signs, in: Information, 10:298, 2019
Syllable Level Features for Parkinson's Disease Detection from Speech, in: ICASSP, 2024
Synthetic References for Template-based ASR using Posterior Features, in: Proceedings of Interspeech, Portland, Oregon, USA, 2012
T
Template-based ASR using Posterior features and synthetic references: comparing different TTS systems, in: SAPA-SCALE Conference, International Speech Communication Association, 2012
Threshold Selection for Unsupervised Detection, with an Application to Microphone Arrays, in: Proceedings of ICASSP 2006, 2006