Keywords:
- acoustic modeling
- Adaboost
- Alzheimer's disease
- Anti-spoofing
- articulatory features
- Artificial Neural Networks
- atypical speech
- Automatic accent assessment
- Automatic accent evaluation
- automatic gender recognition
- Automatic speaker verification (ASV)
- Automatic Speech Recognition
- automatic subword unit derivation
- bag of audio words
- bandwidth
- Binary features
- binary masking
- bioacoustics
- BoAW
- boosting
- breathing pattern estimation
- breathing patterns
- call type classification
- call-type and caller classification
- Children speech recognition
- Classification
- CNN visualization
- ComParE features
- computational efficiency
- Conditional Random Fields
- confidence measures
- continuous speech recognition
- boosted binary features
- resource management
- Convolution Neural Network
- Convolutional neural network
- Convolutional Neural Networks
- COVID-19 identification
- cross-database
- cross-transfer knowledge
- Customer satisfaction
- deep learning
- deep neural networks
- depression detection
- Direction of arrival estimation
- dynamic programming
- Dysarthria
- Dysarthric speech
- embedding
- Emotion Recognition
- end-to-end acoustic modeling
- End-to-end learning
- end-to-end modelling
- end-to-end training
- expected performance and spoofability curve
- Expressive Vocalizations
- feature representations
- feature selection
- Few-shot learning
- fine-tuning
- fixed-size word patterns
- Formant identification
- Formants
- Foundation Model
- Fundamental frequency
- Fusion
- Gaussian mixture
- glottal source signals
- grapheme
- Grapheme subword units
- grapheme subwords
- grapheme-to-phoneme conversion
- grapheme-to-phoneme converter
- Graphemes
- Hidden Markov Model
- hidden Markov models
- human skeleton estimation
- human speech
- integration of ASV and anti-spoofing
- Interpretable Models
- isolated word recognition
- Kalman filters
- KL-divergence
- KL-HMM
- Kullback-Leibler divergence
- Kullback-Leibler divergence based hidden Markov model
- Kullback-Leibler divergence based HMM
- Kullback–Leibler divergence based hidden Markov model
- language disorder
- Language Production
- Large Language Models
- letter-to-sound rules
- lexical model
- Lexical modeling
- Lexicon
- local posterior probability
- localization
- long-term statistics
- LoRA
- low level descriptors
- Mental Lexicon
- microphone array
- microphone arrays
- mobile biometrics
- modalities fusion
- modified ZFF
- multi-layer perceptron
- Multi-modal Approach
- multi-stream combination
- Multi-task learning
- multilayer perceptron
- multilayer perceptron network
- multilingual acoustic modeling
- multiple linear regression
- Multiple speaker localization
- multiple speakers
- multiple-stream combination
- multitask learning
- neural network
- neurocomputational models
- Noise Robustness
- non-native speech
- non-native speech recognition
- Objective Evaluation
- Objective intelligibility
- Objective intelligibility Assessment
- objective measures
- overlapping speech recognition
- Paralinguistic speech processing
- Parkinson's disease
- Parkinson's disease detection
- parts-based approach
- Pathological speech
- Pathological Speech Processing
- Peft
- Perceived fluency
- phoneme
- phoneme modeling
- Phoneme recognition
- phoneme subword units
- phoneme subwords
- phonemes
- Phonetic information
- phonetic representation
- Phonocardiogram
- Posterior features
- posterior probabilities
- pre-trained embedding
- pre-training domain
- predictive coding
- presentation attack
- Presentation Attack Detection
- probabilistic lexical modeling
- pronunciation generation
- pronunciation lexicon
- Raw Speech
- raw waveform modelling
- raw waveforms
- raw-waveform cnn
- Reading Assessment
- recognition
- recurrent neural network
- Respiratory parameters
- S1-S2 detection
- Scottish Gaelic
- segment-level training
- Self-Organizing Maps
- Self-supervised embedding
- self-supervised learning
- sign language assessment
- Sign language processing
- signal processing
- sleepiness
- speaker verification
- speaker-specific features
- spectral statistics
- Speech Analysis
- speech and audio
- speech assessment
- Speech breathing
- Speech Emotion Recognition
- Speech enhancement
- Speech for health
- Speech intelligibility
- speech pathology detection
- speech recognition
- speech separation
- speech synthesis
- Speech technology
- Spoken Language Understanding
- Spoofing
- spoofing detection
- Steered response power
- String matching
- SVM
- syllable-level-features
- syllables
- synthetic reference templates
- Synthetic speech
- TANDEM features
- template-based approach
- template-based system
- Text classification
- text-to-speech synthesis
- tracking
- under-resource speech recognition
- under-resourced languages
- universal phoneme set
- unsupervised adaptation
- utterance verification
- voice activity detection
- Voice Conversion
- zero frequency filter
- Zero frequency filtering
- zero-frequency filtering
- zero-resourced speech recognition
Publications of Mathew Magimai.-Doss sorted by journal and type
International Multimodal Sentiment Analysis Workshop and Challenge (2022)
Comparing Biosignal and Acoustic feature Representation for Continuous Emotion Recognition, in: International Multimodal Sentiment Analysis Workshop and Challenge, 2022
Proceedings of the ICML Expressive Vocalizations Workshop held in conjunction with the 39th International Conference on Machine Learning (2022)
Comparing supervised and self-supervised embedding for ExVo Multi-Task learning track, in: Proceedings of the ICML Expressive Vocalizations Workshop held in conjunction with the 39th International Conference on Machine Learning, Maryland, USA, 2022
Proceedings of ICASSP (2022)
Modeling Of Pre-trained Neural Network Embeddings Learned From Raw Waveform For Covid-19 Infection Detection, in: Proceedings of ICASSP, 2022
Proceedings of Interspeech (2022)
On Breathing Pattern Information in Synthetic Speech, in: Proceedings of Interspeech, 2022
ACM International Conference on Multimodal Interaction (2022)
Towards Accessible Sign Language Learning and Assessment, in: ACM International Conference on Multimodal Interaction, Bangalore, India, pages 626-631, 2022
ACM International Conference on Multimodal Interaction (ICMI Companion) (2022)
Towards Automatic Prediction of Non-Expert Perceived Speech Fluency Ratings, in: ACM International Conference on Multimodal Interaction (ICMI Companion), 2022
Proceedings of Interspeech (2022)
Unsupervised Voice Activity Detection by Modeling Source and System Information using Zero Frequency Filtering, in: Proceedings of Interspeech, 2022
Proceedings of ITG Conference on Speech Communication (2021)
An Objective Evaluation Framework for Pathological Speech Synthesis, in: Proceedings of ITG Conference on Speech Communication, 2021
Proceedings of the 2021 International Conference on Multimodal Interaction (2021)
Approximating the Mental Lexicon from Clinical Interviews as a Support Tool for Depression Detection, in: Proceedings of the 2021 International Conference on Multimodal Interaction, ACM, 2021
2nd Multimodal Sentiment Analysis Challenge (MuSe '21), October 24, 2021, Virtual Event, China (2021)
Fusion of Acoustic and Linguistic Information Using Supervised Autoencoder for Improved Emotion Recognition, in: 2nd Multimodal Sentiment Analysis Challenge (MuSe '21), October 24, 2021, Virtual Event, China, 2021
Proceedings of Interspeech (2021)
Handling acoustic variation in dysarthric speech recognition systems through model combination, in: Proceedings of Interspeech, 2021
Identification of F1 and F2 in speech using modified zero frequency filtering, in: Proceedings of Interspeech, 2021
Proceedings of Interspeech 2021 (2021)
Late Fusion of the Available Lexicon and Raw Waveform-based Acoustic Modeling for Depression and Dementia Recognition, in: Proceedings of Interspeech 2021, ISCA (International Speech Communication Association), 2021
Proceedings of Interspeech (2021)
On Modeling Glottal Source Information for Phonation Assessment in Parkinson’s Disease, in: Proceedings of Interspeech, 2021
Proc. of ICASSP (2021)
On The Relationship Between Speech-based Breathing Signal Prediction Evaluation Measures And Breathing Parameters Estimation, in: Proc. of ICASSP, 2021
Proceedings of European Signal Processing Conference (EUSIPCO) (2021)
Phoneme based Respiratory Analysis of Read Speech, in: Proceedings of European Signal Processing Conference (EUSIPCO), 2021
Proceedings of Interspeech (2020)
A Comparison of Acoustic and Linguistics Methodologies for Alzheimer's Dementia Recognition, in: Proceedings of Interspeech, pages 2182-2186, 2020
Companion Publication of the 2020 International Conference on Multimodal Interaction (ICMI '20 Companion) (2020)
A Phonology-based Approach for Isolated Sign Production Assessment in Sign Language, in: Companion Publication of the 2020 International Conference on Multimodal Interaction (ICMI '20 Companion), 2020
Proceedings of the International Conference on Language Resources and Evaluation LREC 2020 (2020)
An HMM Approach with Inherent Model Selection for Sign Language and Gesture Recognition, in: Proceedings of the International Conference on Language Resources and Evaluation LREC 2020, 2020
Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) (2020)
Detection of S1 and S2 locations in phonocardiogram signals using zero frequency filter, in: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
International Conference on Acoustics, Speech, and Signal Processing (ICASSP) (2020)
Dysarthric Speech Recognition with Lattice-Free MMI, in: International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 6109-6113, 2020
Estimating The Degree of Sleepiness by Integrating Articulatory Feature Knowledge In Raw Waveform Based CNNs, in: International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Barcelona, Spain, 2020
Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) (2020)
Towards Multilingual Sign Language Recognition, in: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) (2019)
HMM-based Approaches to Model Multichannel Information in Sign Language inspired from Articulatory Features-based Speech Processing, in: Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019
Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) (2019)
Improving Children Speech Recognition through Feature Learning from Raw Speech Signal, in: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019
Learning voice source related information for depression detection, in: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019
Segment-level training of ANNs based on acoustic confidence measures for hybrid HMM/ANN Speech Recognition, in: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019
Proceedings of Interspeech (2019)
Understanding and Visualizing Raw Waveform-based CNNs, in: Proceedings of Interspeech, 2019
Using Speech Production Knowledge for Raw Waveform Modelling based Styrian Dialect Identification, in: Proceedings of Interspeech, 2019
Proceedings of Interspeech 2018 (2018)
Denoising and Raw-waveform Networks for Weakly-Supervised Gender Identification on Noisy Speech, in: Proceedings of Interspeech 2018, Hyderabad, India, pages 292-296, 2018
Implementing Fusion Techniques for the Classification of Paralinguistic Information, in: Proceedings of Interspeech 2018, pages 526-530, 2018
Proceedings of Interspeech (2018)
On Learning to Identify Genders from Raw Speech Signal Using CNNs, in: Proceedings of Interspeech, Hyderabad, India, pages 287-291, 2018
On Learning Vocal Tract System Related Speaker Discriminative Information from Raw Signal Using CNNs, in: Proceedings of Interspeech, Hyderabad, India, pages 1116-1120, 2018
Language Resources and Evaluation Conference (2018)
SMILE Swiss German Sign Language Dataset, in: Language Resources and Evaluation Conference, 2018
IEEE International Conference on Acoustics, Speech and Signal Processing (2018)
Towards directly modeling raw speech signal for speaker verification using CNNs, in: IEEE International Conference on Acoustics, Speech and Signal Processing, Calgary, Canada, pages 4884-4888, 2018
International Joint Conference on Biometrics (2017)
End-to-End Convolutional Neural Network-based Voice Presentation Attack Detection, in: International Joint Conference on Biometrics, Denver, Colorado, USA, 2017
Proceedings of Interspeech (2016)
HMM-based Non-native Accent Assessment using Posterior Features, in: Proceedings of Interspeech, San Francisco, USA, 2016
Improving Under-Resourced Language ASR Through Latent Subword Unit Space Discovery, in: Proceedings of Interspeech, 2016
International Conference of the Biometrics Special Interest Group (BIOSIG) (2016)
Presentation Attack Detection Using Long-Term Spectral Statistics for Trustworthy Speaker Verification, in: International Conference of the Biometrics Special Interest Group (BIOSIG), 2016
International Conference on Acoustics, Speech and Signal Processing (2015)
An HMM-Based Formalism for Automatic Subword Unit Derivation and Pronunciation Generation, in: International Conference on Acoustics, Speech and Signal Processing, pages 4639-4643, IEEE, 2015
Proceedings of Interspeech (2015)
Analysis of CNN-based Speech Recognition System using Raw Speech as Input, in: Proceedings of Interspeech, ISCA, Dresden, pages 11-15, 2015
Automatic Accentedness Evaluation of Non-Native Speech Using Phonetic and Sub-Phonetic Posterior Probabilities, in: Proceedings of Interspeech, 2015
International Conference on Acoustics, Speech and Signal Processing (2015)
Convolutional Neural Networks-based Continuous Speech Recognition using Raw Speech Signal, in: International Conference on Acoustics, Speech and Signal Processing, IEEE, South Brisbane, QLD, pages 4295-4299, 2015
International Conference on Acoustics, Speech and Signal Processing (2015)
Integrated Pronunciation Learning for Automatic Speech Recognition Using Probabilistic Lexical Modeling, in: International Conference on Acoustics, Speech and Signal Processing, South Brisbane, QLD, pages 5176-5180, 2015
Proceedings of Interspeech (2015)
Objective Intelligibility Assessment of Text-to-Speech Systems Through Utterance Verification, in: Proceedings of Interspeech, Dresden, Germany, pages 3501-3505, 2015
40th IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP) (2015)
Objective Speech Intelligibility Assessment through Comparison of Phoneme Class Conditional Probability Sequences, in: 40th IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), Brisbane, Australia, pages 4924-4928, 2015
4th Biennial Workshop on Less-Resourced Languages (2015)
Pronunciation Lexicon Development for Under-Resourced Languages Using Automatically Derived Subword Units: A Case Study on Scottish Gaelic, in: 4th Biennial Workshop on Less-Resourced Languages, 2015
Global Conference on Signal and Information Processing (2014)
Joint Phoneme Segmentation Inference and Classification using CRFs, in: Global Conference on Signal and Information Processing, Atlanta, GA, pages 587-591, IEEE, 2014
International Conference on Acoustics, Speech, and Signal Processing (2014)
On Modeling Context-Dependent Clustered States: Comparing HMM/GMM, Hybrid HMM/ANN and KL-HMM Approaches, in: International Conference on Acoustics, Speech, and Signal Processing, Florence, IT, pages 7659-7663, IEEE, 2014
Proceedings of the 15th Annual Conference of the International Speech Communication Association (Interspeech 2014) (2014)
On Recognition of Non-Native Speech Using Probabilistic Lexical Model, in: Proceedings of the 15th Annual Conference of the International Speech Communication Association (Interspeech 2014), 2014