Keywords:
- Age prediction
- Aho-Corasick algorithm
- Alzheimer's disease
- ASR
- ASR robustness
- Author profiling
- Automatic Depression Detection
- automatic speech recognition (ASR)
- Bag of Words
- Benchmarking
- BERT
- BERTopic
- bias
- bias aware
- Clinical Interviews
- Computational linguistics
- Contextual Adaptation
- Contextual Entrepreneurship
- Contextualisation and adaptation of ASR
- conversational AI
- Cross-modal Alignment
- Cross-modal Attention
- Data Mining
- Data Selection
- dataset
- deep learning
- deep learning models
- Depression Corpora
- depression detection
- dialogue simulation
- domain adaptation
- Domain Classification
- Dual mode encoder
- Early Risk Detection
- embedding
- Entrepreneurial Challenges
- explainability
- F1 score
- fact checking
- factual reporting
- finite-state transducers
- Foundation Models
- Gender prediction
- GPU decoding
- Graph Convolutional Network (GCN)
- Graph Convolutional Networks
- Graph Neural Networks
- health care
- Human-Computer Interaction
- Industry Context
- information verification
- Intelligent Systems
- Intent Classification
- Interpretability
- Interpretable Models
- knowledge distillation
- language identification
- Language Production
- Large Language Models
- limited training data
- LLM
- LLM-based ASR
- Location prediction
- low-resource domains
- machine learning
- media bias
- Medical Sector
- Mental Health
- Mental Lexicon
- Mexican Tourist Text
- Multi-modal Approach
- multimodal analysis
- Multimodal classification
- multitask learning
- multitask training
- Nahuatl and Spanish utterances
- named entity recognition
- Natural language processing
- Natural Language Understanding
- news media
- node weighted graphs
- Occupation prediction
- online speech recognition
- Operant Motive Test
- orchestration
- personas
- prompt projection
- pseudo-labelling
- Psycholinguistics
- Raw Speech
- real-time ASR
- real-time speech recognition
- Recommendation System
- reinforcement learning
- reliability estimation
- reproducibility
- resources and evaluation
- scenario management
- self-supervised learning
- Sentiment Analysis
- Service Robots
- Sexual predators identification
- shallow fusion
- slot filling
- social media analysis
- Speaker change detection
- speaker turn detection
- speech recognition
- Speech-to-LLM alignment
- speech-to-text alignment
- Spoken Language Understanding
- streaming ASR
- streaming transducer
- Supervised Autoencoders
- synthetic dialogue
- Text Analysis
- Text classification
- text denoising
- Text fine-tuning
- Text Information Organization Schemes
- Text Mining
- Text Representation
- topic detection
- topic modeling
- transformer transducer
- transformers
- whisper
- Word Consensus Networks
- Word-Confusion-Networks
- XLSR
- XLSR-Transducer
- Zipformer
Publications of Esaú Villatoro-Tello sorted by first author
A
| Author Profiling in Social Media with Multimodal Information, , , and , in: Computación y Sistemas (CyS), 24(3), 2020 |
[URL] |
| Classifying the Social Media Author Profile Through a Multimodal Representation, , , and , in: Intelligent Technologies: Concepts, Applications, and Future Directions. Studies in Computational Intelligence, Springer, 2022 |
[DOI] [URL] |
B
| DAIC-WOZ: On the Validity of Using the Therapist's prompts in Automatic Depression Detection from Clinical Interviews, , , , , and , in: Proceedings of the 6th Clinical Natural Language Processing Workshop, Mexico City, Mexico, pages 82–90, Association for Computational Linguistics, 2024 |
[DOI] [URL] |
| Reliability Estimation of News Media Sources: Birds of a Feather Flock Together, , , and , in: Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Mexico City, Mexico, pages 6900–6918, Association for Computational Linguistics, 2024 |
[DOI] [URL] |
| Node-weighted Graph Convolutional Network for Depression Detection in Transcribed Clinical Interviews, , , and , Idiap-RR-03-2023 |
|
| Node-weighted Graph Convolutional Network for Depression Detection in Transcribed Clinical Interviews, , , and , in: Proceedings of Interspeech, 2023 |
|
| IDIAPers @ Causal News Corpus 2022: Efficient Causal Relation Identification Through a Prompt-based Few-shot Approach, , , , , , and , Idiap-RR-13-2022 |
|
| IDIAPers @ Causal News Corpus 2022: Efficient Causal Relation Identification Through a Prompt-based Few-shot Approach, , , , , , and , in: The 5th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE @ EMNLP 2022), 2022 |
[URL] |
C
| The Winning Approach for the Recommendation Systems Shared Task @REST_MEX 2022, , , , and , in: Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2022), 2022 |
[URL] |
| Analysis of Vector Representations in Maintenance Logs in the Industry: Towards an Information Retrieval System, , , and , in: Journal of Research in Computing Science, 2021 |
| Better Semi-supervised Learning for Multi-domain ASR Through Incremental Retraining and Data Filtering, , , , , , , , , , , , and , in: Interspeech 2025, Rotterdam, The Netherlands, pages 3618–3622, 2025 |
[DOI] [URL] |
D
| Natural Language Processing in Healthcare, , , , and , Taylor & Francis Group, 2022 |
[DOI] [URL] |
| Intelligent Technologies: Concepts, Applications, and Future Directions, Volume 2, and , Springer, volume 1098, 2023 |
[DOI] |
F
| BertAA: BERT fine-tuning for Authorship Attribution, , , and , in: Proceedings of the 17th International Conference on Natural Language Processing, 2020 |
|
| IDIAPers @ Causal News Corpus 2022: Extracting Cause-Effect-Signal Triplets via Pre-trained Autoregressive Language Model, , , , , , and , Idiap-RR-12-2022 |
|
| IDIAPers @ Causal News Corpus 2022: Extracting Cause-Effect-Signal Triplets via Pre-trained Autoregressive Language Model, , , , , , and , in: The 5th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE @ EMNLP 2022), 2022 |
[URL] |
H
| Natural Language Understanding for Navigation of Service Robots in Low-Resource Domains and Languages: Scenarios in Spanish and Nahuatl, , , , and , in: Mathematics, 12(8), 2024 |
[DOI] [URL] |
| Sentiment Analysis using pretrained LLMs, , and , Idiap-RR-05-2024 |
|
I
| The Greatest Challenge For Startups: Computational Text Analysis on Swiss Ventures, , , , and , in: Academy of Management Proceedings, 2025 |
[URL] |
| Implementing contextual biasing in GPU decoder for online ASR, , , , , , and , Idiap-RR-02-2023 |
|
| Implementing contextual biasing in GPU decoder for online ASR, , , , , , and , in: Proc. Interspeech 2023, pages 4494–4498, 2023 |
[DOI] [URL] |
| Unifying Global and Near-Context Biasing in a Single Trie Pass, , , , , , , , , , , and , in: Text, Speech, and Dialogue. TSD 2025. Lecture Notes in Computer Science, Springer, 2025 |
[DOI] [URL] |
| Fast Streaming Transducer ASR Prototyping via Knowledge Distillation with Whisper, , , , , , , , and , Idiap-RR-10-2024 |
|
K
| Multitask Speech Recognition and Speaker Change Detection for Unknown Number of Speakers, , , , , , , and , in: Proceedings of the 49th IEEE International Conference on Acoustics, Speech, & Signal Processing (ICASSP) 2024, Seoul, Republic of Korea, pages 12592–12596, IEEE, 2024 |
[DOI] [URL] |
| TokenVerse++: Towards Flexible Multitask Learning with Dynamic Task Activation, , , , , , , , , , and , in: 2025 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), IEEE, 2025 |
|
| TokenVerse: Unifying Speech and NLP Tasks via Transducer-based ASR, , , , , , , , and , Idiap-RR-07-2024 |
[URL] |
| TokenVerse: Towards Unifying Speech and NLP Tasks via Transducer-based ASR, , , , , , , , and , in: Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 20988–20995, Association for Computational Linguistics (ACL), 2024 |
[DOI] [URL] |
| XLSR-Transducer: Streaming ASR for Self-Supervised Pretrained Models, , , , , , , and , Idiap-RR-08-2024 |
[URL] |
| XLSR-Transducer: Streaming ASR for Self-Supervised Pretrained Models, , , , , , , and , in: Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Hyderabad, India, IEEE, 2025 |
[DOI] [URL] |
| Performance Evaluation of SLAM-ASR: The Good, the Bad, the Ugly, and the Way Forward, , , , , , , , , and , in: SALMA Workshop, Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Hyderabad, India, IEEE, 2025 |
[URL] |
P
| BertOdia: BERT pre-training for low resource Odia language, , , , , and , Idiap-RR-16-2021 |
|
| Open Machine Translation for Low Resource South American Languages (AmericasNLP 2021 Shared Task Contribution), , , , , , , , and , Idiap-RR-07-2021 |
|
| Open Machine Translation for Low Resource South American Languages (AmericasNLP 2021 Shared Task Contribution), , , , , , , , and , in: Proceedings of the First Workshop on Natural Language Processing for Indigenous Languages of the Americas, pages 218–223, Association for Computational Linguistics, 2021 |
[DOI] [URL] |
| Detection of Similar Languages and Dialects Using Deep Supervised Autoencoders, , , , and , in: Proceedings of the 17th International Conference on Natural Language Processing, 2020 |
|
| Idiap Submission to Swiss-German Language Detection Shared Task, , , , and , Idiap-RR-11-2020 |
|
| Idiap Submission to Swiss-German Language Detection Shared Task, , , , and , in: Proceedings of the 5th Swiss Text Analytics Conference (SwissText) & 16th Conference on Natural Language Processing (KONVENS), CEUR Workshop Proceedings, 2020 |
[URL] |