Keywords:
- adaptation
- ADS-B data
- Air traffic control
- air traffic control communications
- AM
- Anti-spoofing
- audio and voice analysis
- Automatic Speech Recognition
- automatic speech recognition (ASR)
- automatic speech recognition and understanding
- batch norm
- batch normalization
- bayesian fusion
- Call-sign Recognition
- Contextual Adaptation
- contextual biasing
- conversational modeling
- Convolutional Neural Networks
- Cross-modal Alignment
- Cross-modal Attention
- ctc
- Data Selection
- deep neural networks
- depression detection
- domain adaptation
- Domain Classification
- e2e-lfmmi
- entity linking
- F1 score
- fine-tuning
- finite-state transducers
- FM
- Forensics
- GPU decoding
- Graph Neural Networks
- Human-Computer Interaction
- i-vector
- i-vectors
- Intent Classification
- inter-task fusion
- Interpretability
- Interpretable Models
- knowledge distillation
- language identification
- Language Production
- LEA
- limited training data
- Linear prediction
- logistic regression
- low-resource
- Mental Lexicon
- multi-lingual automatic speech recognition
- multi-lingual SAD
- Multilingual automatic speech recognition
- multitask learning
- multitask training
- named entity recognition
- node weighted graphs
- online speech recognition
- OOV-word recognition
- OpenSky Network
- OSINT
- out-of-domain
- rare word recognition
- real-time speech recognition
- ROXANNE
- ROXSD
- sentence embeddings
- Speaker change detection
- speaker clustering
- Speaker identification
- speaker recognition
- speaker role detection
- speaker turn detection
- speaker verification
- Speech activity detection
- speech dataset
- speech recognition
- spoken dialogue systems
- Spoken Language Understanding
- subspace Gaussian mixture models
- supervised adaptation
- task-oriented dialog
- Text classification
- transfer learning
- transformers
- user identity linkage
- wav2vec 2.0
- wav2vec2
- whisper
- Word Consensus Networks
- Word-Confusion-Networks
- XLSR-Transducer
- Zipformer
Publications of Srikanth Madikeri sorted by recency
INTRA-CLASS COVARIANCE ADAPTATION IN PLDA BACK-ENDS FOR SPEAKER VERIFICATION, , , and , Idiap-RR-05-2017
Analysis of Posterior Estimation Approaches to I-vector Extraction for Speaker Recognition, , , and , Idiap-RR-15-2018
Two-Pass IB based Speaker Diarization System using Meeting-Specific ANN based Features, , , and , in: Proceedings of Interspeech 2016, pages 2199-2203, 2016
Two-Pass IB based Speaker Diarization System using Meeting-Specific ANN based Features, , , and , Idiap-RR-09-2018
IDIAP SUBMISSION TO THE NIST SRE 2016 SPEAKER RECOGNITION EVALUATION, , , , and , Idiap-RR-32-2016
Implementation of the Standard I-vector System for the Kaldi Speech Recognition Toolkit, , , and , Idiap-RR-26-2016
Inter-task System Fusion for Speaker Recognition, , , , and , in: Proceedings of Interspeech, 2016
A Large-Scale Open-Source Acoustic Simulator for Speaker Recognition, , , , and , in: IEEE Signal Processing Letters, 23(4):527-531, 2016
SYSTEM FUSION AND SPEAKER LINKING FOR LONGITUDINAL DIARIZATION OF TV SHOWS, , , and , in: Proceedings of 2016 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2016), Shanghai, pages 5495-5499, IEEE, 2016
INFORMATION THEORETIC CLUSTERING FOR UNSUPERVISED DOMAIN-ADAPTATION, , and , in: Proceedings of 2016 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2016), Shanghai, pages 5580-5584, IEEE, 2016
DEEP NEURAL NETWORK BASED POSTERIORS FOR TEXT-DEPENDENT SPEAKER VERIFICATION, , , and , in: Proceedings of 2016 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2016), Shanghai, pages 5050-5054, IEEE, 2016
INFORMATION THEORETIC CLUSTERING FOR UNSUPERVISED DOMAIN-ADAPTATION, , and , Idiap-RR-09-2016
DEEP NEURAL NETWORK BASED POSTERIORS FOR TEXT-DEPENDENT SPEAKER VERIFICATION, , , and , Idiap-RR-08-2016
Towards utterance-based neural network adaptation in acoustic modeling, , , and , in: IEEE Automatic Speech Recognition and Understanding Workshop, pages 289-295, 2015
Integrating Online I-vector extractor with Information Bottleneck based Speaker Diarization system, , , and , in: Proceedings of Interspeech 2015, pages 3105-3109, 2015
Integrating Online I-vector extractor with Information Bottleneck based Speaker Diarization system, , , and , Idiap-RR-20-2015
KL-HMM BASED SPEAKER DIARIZATION SYSTEM FOR MEETINGS, and , in: Proceedings of ICASSP 2015, pages 4435-4439, 2015
COMBINING SGMM SPEAKER VECTORS AND KL-HMM APPROACH FOR SPEAKER DIARIZATION, , and , in: Proceedings of ICASSP 2015, pages 4834-4837, 2015
EMPLOYMENT OF SUBSPACE GAUSSIAN MIXTURE MODELS IN SPEAKER RECOGNITION, , , and , in: 2015 IEEE International Conference on Acoustics, Speech, and Signal Processing, IEEE, Brisbane, Australia, pages 4445-4449, 2015
EMPLOYMENT OF SUBSPACE GAUSSIAN MIXTURE MODELS IN SPEAKER RECOGNITION, , , and , Idiap-RR-16-2015
COMBINING SGMM SPEAKER VECTORS AND KL-HMM APPROACH FOR SPEAKER DIARIZATION, , and , Idiap-RR-17-2015
KL-HMM BASED SPEAKER DIARIZATION SYSTEM FOR MEETINGS, and , Idiap-RR-19-2015
MODIFIED GROUP DELAY FEATURE BASED TOTAL VARIABILITY SPACE MODELLING FOR SPEAKER RECOGNITION, , and , in: International Journal of Speech Technology, 18(1):17-23, 2014
Feature Switching in the i-vector Framework for Speaker Verification, , , , and , in: Proc. of Interspeech 2014, pages 5, 2014 |
Improving Real Time Factor of Information Bottleneck-based Speaker Diarization System, , and , Idiap-RR-18-2015