Keywords:
- acoustic modeling
- Adaboost
- Alzheimer's disease
- Anti-spoofing
- articulatory features
- Artificial Neural Networks
- atypical speech
- Automatic accent assessment
- Automatic accent evaluation
- automatic gender recognition
- Automatic speaker verification (ASV)
- Automatic Speech Recognition
- automatic subword unit derivation
- bag of audio words
- bandwidth
- Binary features
- binary masking
- bioacoustics
- BoAW
- boosting
- breathing pattern estimation
- breathing patterns
- call type classification
- call-type and caller classification
- Children speech recognition
- Classification
- CNN visualization
- ComParE features
- computational efficiency
- Conditional Random Fields
- confidence measures
- continuous speech recognition
- boosted binary features
- resource management
- Convolution Neural Network
- Convolutional neural network
- Convolutional Neural Networks
- COVID-19 identification
- cross-database
- cross-transfer knowledge
- Customer satisfaction
- deep learning
- deep neural networks
- depression detection
- Direction of arrival estimation
- dynamic programming
- Dysarthria
- Dysarthric speech
- embedding
- Emotion Recognition
- end-to-end acoustic modeling
- End-to-end learning
- end-to-end modelling
- end-to-end training
- expected performance and spoofability curve
- Expressive Vocalizations
- feature representations
- feature selection
- Few-shot learning
- fine-tuning
- fixed-size word patterns
- Formant identification
- Formants
- Foundation Model
- Fundamental frequency
- Fusion
- Gaussian mixture
- glottal source signals
- grapheme
- Grapheme subword units
- grapheme subwords
- grapheme-to-phoneme conversion
- grapheme-to-phoneme converter
- Graphemes
- Hidden Markov Model
- hidden Markov models
- human skeleton estimation
- human speech
- integration of ASV and anti-spoofing
- Interpretable Models
- isolated word recognition
- Kalman filters
- KL-divergence
- KL-HMM
- Kullback-Leibler divergence
- Kullback-Leibler divergence based hidden Markov model
- Kullback-Leibler divergence based HMM
- language disorder
- Language Production
- Large Language Models
- letter-to-sound rules
- lexical model
- Lexical modeling
- Lexicon
- local posterior probability
- localization
- long-term statistics
- LoRA
- low level descriptors
- Mental Lexicon
- microphone array
- microphone arrays
- mobile biometrics
- modalities fusion
- modified ZFF
- multi-layer perceptron
- Multi-modal Approach
- multi-stream combination
- Multi-task learning
- multilayer perceptron
- multilayer perceptron network
- multilingual acoustic modeling
- multiple linear regression
- Multiple speaker localization
- multiple speakers
- multiple-stream combination
- multitask learning
- neural network
- neurocomputational models
- Noise Robustness
- non-native speech
- non-native speech recognition
- Objective Evaluation
- Objective intelligibility
- Objective intelligibility Assessment
- objective measures
- overlapping speech recognition
- Paralinguistic speech processing
- Parkinson's disease
- Parkinson's disease detection
- parts-based approach
- Pathological speech
- Pathological Speech Processing
- PEFT
- Perceived fluency
- phoneme
- phoneme modeling
- Phoneme recognition
- phoneme subword units
- phoneme subwords
- phonemes
- Phonetic information
- phonetic representation
- Phonocardiogram
- Posterior features
- posterior probabilities
- pre-trained embedding
- pre-training domain
- predictive coding
- presentation attack
- Presentation Attack Detection
- probabilistic lexical modeling
- pronunciation generation
- pronunciation lexicon
- Raw Speech
- raw waveform modelling
- raw waveforms
- raw-waveform cnn
- Reading Assessment
- recognition
- recurrent neural network
- Respiratory parameters
- S1-S2 detection
- Scottish Gaelic
- segment-level training
- Self-Organizing Maps
- Self-supervised embedding
- self-supervised learning
- sign language assessment
- Sign language processing
- signal processing
- sleepiness
- speaker verification
- speaker-specific features
- spectral statistics
- Speech Analysis
- speech and audio
- speech assessment
- Speech breathing
- Speech Emotion Recognition
- Speech enhancement
- Speech for health
- Speech intelligibility
- speech pathology detection
- speech recognition
- speech separation
- speech synthesis
- Speech technology
- Spoken Language Understanding
- Spoofing
- spoofing detection
- Steered response power
- String matching
- SVM
- syllable-level-features
- syllables
- synthetic reference templates
- Synthetic speech
- TANDEM features
- template-based approach
- template-based system
- Text classification
- text-to-speech synthesis
- tracking
- under-resource speech recognition
- under-resourced languages
- universal phoneme set
- unsupervised adaptation
- utterance verification
- voice activity detection
- Voice Conversion
- zero frequency filter
- Zero frequency filtering
- zero-frequency filtering
- zero-resourced speech recognition
Publications of Mathew Magimai.-Doss sorted by journal and type
Publications of type Idiap-RR
2024
Estimating Breathing Pattern from Raw Speech Waveform and Short-term Speech Spectrum using Neural Networks, Idiap-RR-12-2024
Feature Representations for Automatic Meerkat Vocalization Classification, Idiap-RR-06-2024
Posterior-based analysis of spatio-temporal features for Sign Language Assessment, Idiap-RR-11-2024
Towards Dynamic Skeleton-based Handshape Subunits for Sign Language Assessment, Idiap-RR-09-2024
2023
Idiap Scientific Report 2022, Idiap-RR-05-2023
2021
Adjustable Deterministic Pseudonymization of Speech, Idiap-RR-12-2021
Approximating the Mental Lexicon from Clinical Interviews as a Support Tool for Depression Detection, Idiap-RR-19-2021
Late Fusion of the Available Lexicon and Raw Waveform-based Acoustic Modeling for Depression and Dementia Recognition, Idiap-RR-09-2021
Probabilistic Symbol Sequence Matching and its Application to Pathological Speech Intelligibility Assessment, Idiap-RR-01-2021
Towards Automatic Prediction of Non-Expert Perceived Speech Fluency Ratings, Idiap-RR-11-2021
2019
Data-Driven Movement Subunit Extraction from Skeleton Information for Modeling Signs and Gestures, Idiap-RR-02-2019
Domain Adaptation and Investigation of Robustness of DNN-based Embeddings for Text-Independent Speaker Verification Using Dilated Residual Networks, Idiap-RR-10-2019
Estimating The Degree of Sleepiness by Integrating Articulatory Feature Knowledge In Raw Waveform Based CNNs, Idiap-RR-06-2020
Towards Multilingual Sign Language Recognition, Idiap-RR-16-2019
Understanding Raw Waveform based CNN through Low-rank Spectro-Temporal Decoupling, Idiap-RR-11-2019
2018
Gradient-based spectral visualization of CNNs using raw waveforms, Idiap-RR-11-2018
Modelling glottal source information for depression detection, Idiap-RR-13-2018
2017
Long Term Spectral Statistics for Voice Presentation Attack Detection, Idiap-RR-11-2017
Towards directly modeling raw speech signal for speaker verification using CNNs, Idiap-RR-30-2017
Towards Weakly Supervised Acoustic Subword Unit Discovery and Lexicon Development Using Hidden Markov Models, Idiap-RR-15-2017
2016
End-to-End Acoustic Modeling using Convolutional Neural Networks for Automatic Speech Recognition, Idiap-RR-18-2016
2015
Acoustic Data-Driven Grapheme-to-Phoneme Conversion in the Probabilistic Lexical Modeling Framework, Idiap-RR-10-2015
Analysis of CNN-based Speech Recognition System using Raw Speech as Input, Idiap-RR-23-2015
Automatic Accentedness Evaluation of Non-Native Speech Using Phonetic and Sub-Phonetic Posterior Probabilities, Idiap-RR-12-2015
HMM-based Non-native Accent Assessment using Posterior Features, Idiap-RR-32-2015
Learning linearly separable features for speech recognition using convolutional neural networks, Idiap-RR-24-2015
Objective Intelligibility Assessment of Text-to-Speech Systems Through Utterance Verification, Idiap-RR-06-2015
On the Application of Automatic Subword Unit Derivation and Pronunciation Generation for Under-Resourced Language ASR: A Study on Scottish Gaelic, Idiap-RR-13-2015
Posterior-Based Multi-Stream Formulation To Combine Multiple Grapheme-to-Phoneme Conversion Techniques, Idiap-RR-33-2015
Towards Multiple Pronunciation Generation in Acoustic G2P Conversion Framework, Idiap-RR-34-2015
2014
Acoustic and Lexical Resource Constrained ASR using Language-Independent Acoustic Model and Language-Dependent Probabilistic Lexical Model, Idiap-RR-02-2014
Articulatory Feature based Continuous Speech Recognition using Probabilistic Lexical Modeling, Idiap-RR-19-2014
Convolutional Neural Networks-based Continuous Speech Recognition using Raw Speech Signal, Idiap-RR-18-2014
Feature Mapping of Multiple Beamformed Sources for Robust Overlapping Speech Recognition Using a Microphone Array, Idiap-RR-17-2014
Objective Speech Intelligibility Assessment through Comparison of Phoneme Class Conditional Probability Sequences, Idiap-RR-16-2014
Raw Speech Signal-based Continuous Speech Recognition using Convolutional Neural Networks, Idiap-RR-15-2014
2013
End-to-end Phoneme Sequence Recognition using Convolutional Neural Networks, Idiap-RR-40-2013
Estimating Phoneme Class Conditional Probabilities from Raw Speech Signal using Convolutional Neural Networks, Idiap-RR-13-2013