Keywords:
- adaptation
- ADS-B data
- Air traffic control
- air traffic control communications
- AM
- Anti-spoofing
- audio and voice analysis
- Automatic Speech Recognition
- automatic speech recognition (ASR)
- automatic speech recognition and understanding
- batch norm
- batch normalization
- bayesian fusion
- Call-sign Recognition
- Contextual Adaptation
- contextual biasing
- conversational modeling
- Convolutional Neural Networks
- Cross-modal Alignment
- Cross-modal Attention
- ctc
- Data Selection
- deep neural networks
- depression detection
- domain adaptation
- Domain Classification
- e2e-lfmmi
- entity linking
- F1 score
- fine-tuning
- finite-state transducers
- FM
- Forensics
- GPU decoding
- Graph Neural Networks
- Human-Computer Interaction
- i-vector
- i-vectors
- Intent Classification
- inter-task fusion
- Interpretability
- Interpretable Models
- knowledge distillation
- language identification
- Language Production
- LEA
- limited training data
- Linear prediction
- logistic regression
- low-resource
- Mental Lexicon
- multi-lingual automatic speech recognition
- multi-lingual SAD
- Multilingual automatic speech recognition
- multitask learning
- multitask training
- named entity recognition
- node weighted graphs
- online speech recognition
- OOV-word recognition
- OpenSky Network
- OSINT
- out-of-domain
- rare word recognition
- real-time speech recognition
- ROXANNE
- ROXSD
- sentence embeddings
- Speaker change detection
- speaker clustering
- Speaker identification
- speaker recognition
- speaker role detection
- speaker turn detection
- speaker verification
- Speech activity detection
- speech dataset
- speech recognition
- spoken dialogue systems
- Spoken Language Understanding
- subspace Gaussian mixture models
- supervised adaptation
- task-oriented dialog
- Text classification
- transfer learning
- transformers
- user identity linkage
- wav2vec 2.0
- wav2vec2
- whisper
- Word Consensus Networks
- Word-Confusion-Networks
- XLSR-Transducer
- Zipformer
Publications of Srikanth Madikeri sorted by journal and type
Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (2022)
Expanded Lattice Embeddings for Spoken Document Retrieval on Informal Meetings, in: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, ACM, 2022
The Speaker and Language Recognition Workshop (2022)
Speaker recognition on mono-channel telephony recordings, in: The Speaker and Language Recognition Workshop, 2022
2021 IEEE International Conference on Acoustics, Speech and Signal Processing (2021)
A COMPARISON OF METHODS FOR OOV-WORD RECOGNITION ON A NEW PUBLIC DATASET, in: 2021 IEEE International Conference on Acoustics, Speech and Signal Processing, IEEE Signal Processing Society, Toronto, Ontario, Canada, 2021
Proceedings of Interspeech (2021)
Comparing CTC and LFMMI for out-of-domain adaptation of wav2vec 2.0 acoustic model, in: Proceedings of Interspeech, 2021
1st ISCA Symposium on Security and Privacy in Speech Communication (2021)
Graph2Speak: Improving Speaker Identification using Network Knowledge in Criminal Conversational Data, in: 1st ISCA Symposium on Security and Privacy in Speech Communication, pages 10-13, 2021
Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (2021)
LATTICE-FREE MMI ADAPTATION OF SELF-SUPERVISED PRETRAINED ACOUSTIC MODELS, in: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, 2021
Proceedings of Interspeech 2021 (2021)
Multitask adaptation with Lattice-Free MMI for multi-genre speech recognition of low resource languages, in: Proceedings of Interspeech 2021, 2021
Interspeech (2021)
Speech Activity Detection Based on Multilingual Speech Recognition System, in: Interspeech, 2021
Proceedings of ICASSP 2020 (2020)
INCREMENTAL SEMI-SUPERVISED LEARNING FOR MULTI-GENRE SPEECH RECOGNITION, in: Proceedings of ICASSP 2020, 2020
Proceedings of Interspeech 2020 (2020)
Lattice-Free Maximum Mutual Information Training of Multilingual Speech Recognition System, in: Proceedings of Interspeech 2020, pages 4746-4750, ISCA, 2020
Interspeech (2020)
Supervised domain adaptation for text-independent speaker verification using limited data, in: Interspeech, pages 3815-3819, 2020
Proceedings of ICASSP 2019 (2019)
A BAYESIAN APPROACH TO INTER-TASK FUSION FOR SPEAKER RECOGNITION, in: Proceedings of ICASSP 2019, Brighton, ENGLAND, pages 5786-5790, 2019
11th International workshop on Models and Analysis of Vocal Emissions for Biomedical Applications (2019)
AM-FM DECOMPOSITION OF SPEECH SIGNAL: APPLICATIONS FOR SPEECH PRIVACY AND DIAGNOSIS, in: 11th International workshop on Models and Analysis of Vocal Emissions for Biomedical Applications, Universita Degli Studi Firenze, Firenze, Italy, 2019
Proceedings of ICASSP 2019 (2019)
INCREMENTAL TRANSFER LEARNING IN TWO-PASS INFORMATION BOTTLENECK BASED SPEAKER DIARIZATION SYSTEM FOR MEETINGS, in: Proceedings of ICASSP 2019, pages 6291-6295, 2019
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: System Demonstrations. 2019 (2019)
SARAL: A Low-Resource Cross-Lingual Domain-Focused Information Retrieval System for Effective Rapid Document Triage, in: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: System Demonstrations. 2019, pages 19-24, 2019
Proceedings of Interspeech 2018 (2018)
Analysis of Language Dependent Front-End for Speaker Recognition, in: Proceedings of Interspeech 2018, Hyderabad, INDIA, pages 1101-1105, 2018
Proceedings of 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing (2018)
DNN based speaker embedding using content information for text-dependent speaker verification, in: Proceedings of 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2018
Proceedings of Interspeech 2018 (2018)
End-to-end text-dependent speaker verification using novel distance measures, in: Proceedings of Interspeech 2018, Hyderabad, INDIA, Sep 02-06, 2018, pages 3598-3602, 2018
Big Data and Artificial Intelligence for Military Decision Making (2018)
SIIP: An Innovative Speaker Identification Approach for Law Enforcement Agencies, in: Big Data and Artificial Intelligence for Military Decision Making, http://www.sto.nato.int/, pages PT-1 - 1: PT-1 - 14, STO, 2018
Proc. of Interspeech (2017)
Content Normalization for Text-dependent Speaker Verification, in: Proc. of Interspeech, 2017
Proceedings of 2017 IEEE International Conference on Acoustics, Speech, and Signal Processing (2017)
EXPLOITING SEQUENCE INFORMATION FOR TEXT-DEPENDENT SPEAKER VERIFICATION, in: Proceedings of 2017 IEEE International Conference on Acoustics, Speech, and Signal Processing, New Orleans, pages 5370-5374, 2017
Proceedings of International Conference on Acoustics, Speech and Signal Processing (2017)
INTRA-CLASS COVARIANCE ADAPTATION IN PLDA BACK-ENDS FOR SPEAKER VERIFICATION, in: Proceedings of International Conference on Acoustics, Speech and Signal Processing, pages 5365-5369, 2017
European Intelligence and Security Informatics Conference (EISIC) 2017 (2017)
Towards a breakthrough Speaker Identification approach for Law Enforcement Agencies: SIIP, in: European Intelligence and Security Informatics Conference (EISIC) 2017, Athens, Greece, pages 32-39, IEEE Computer Society, 2017
Proceedings of 2016 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2016) (2016)
DEEP NEURAL NETWORK BASED POSTERIORS FOR TEXT-DEPENDENT SPEAKER VERIFICATION, in: Proceedings of 2016 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2016), Shanghai, pages 5050-5054, IEEE, 2016
INFORMATION THEORETIC CLUSTERING FOR UNSUPERVISED DOMAIN-ADAPTATION, in: Proceedings of 2016 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2016), Shanghai, pages 5580-5584, IEEE, 2016
Proceedings of the INTERSPEECH (2016)
Inter-task System Fusion for Speaker Recognition, in: Proceedings of the INTERSPEECH, 2016
Proceedings of 2016 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2016) (2016)
SYSTEM FUSION AND SPEAKER LINKING FOR LONGITUDINAL DIARIZATION OF TV SHOWS, in: Proceedings of 2016 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2016), Shanghai, pages 5495-5499, IEEE, 2016
Proceedings of Interspeech 2016 (2016)
Two-Pass IB based Speaker Diarization System using Meeting-Specific ANN based Features, in: Proceedings of Interspeech 2016, pages 2199-2203, 2016
Proceedings of ICASSP 2015 (2015)
COMBINING SGMM SPEAKER VECTORS AND KL-HMM APPROACH FOR SPEAKER DIARIZATION, in: Proceedings of ICASSP 2015, pages 4834-4837, 2015
2015 IEEE International Conference on Acoustics, Speech, and Signal Processing (2015)
EMPLOYMENT OF SUBSPACE GAUSSIAN MIXTURE MODELS IN SPEAKER RECOGNITION, in: 2015 IEEE International Conference on Acoustics, Speech, and Signal Processing, IEEE, Brisbane, Australia, pages 4445-4449, 2015
Proceedings of Interspeech 2015 (2015)
Integrating Online I-vector extractor with Information Bottleneck based Speaker Diarization system, in: Proceedings of Interspeech 2015, pages 3105-3109, 2015
Proceedings of ICASSP 2015 (2015)
KL-HMM BASED SPEAKER DIARIZATION SYSTEM FOR MEETINGS, in: Proceedings of ICASSP 2015, pages 4435-4439, 2015
IEEE Automatic Speech Recognition and Understanding Workshop (2015)
Towards utterance-based neural network adaptation in acoustic modeling, in: IEEE Automatic Speech Recognition and Understanding Workshop, pages 289-295, 2015
Proc. of Interspeech 2014 (2014)
Feature Switching in the i-vector Framework for Speaker Verification, in: Proc. of Interspeech 2014, pages 5, 2014