Keywords:
- acoustic modeling
- AMI Meetings
- Automatic Speech Recognition
- Channel selection
- Confidence Measure (CM)
- far-field speech
- information bottleneck
- Information Bottleneck clustering
- KeyWord Spotting (KWS)
- Language IDentification (LID)
- Large Vocabulary Continuous Speech Recognition (LVCSR)
- LVCSR
- meetings
- mutual information
- Out- Of-Language (OOL) detection
- Out-Of-Language (OOL) detection
- Overlap speech
- Paralinguistic
- Prosodic features
- Speaker Diarization
- Speaker Role Labeling
- speech recognition
- speech recognition.
- Spoken Language Understanding
- Spoken Term Detection (STD)
- Spontaneous Conversation
- Subs-ace Gaussian Mixture Models
- System Combination
- TANDEM features
- temporal modulations
- Turn-taking features
Publications of Fabio Valente
| 1 | 2 |
2012
| Annotation and Recognition of Personality Traits in Spoken Conversations from the AMI Meetings Corpus, , and , in: Proceedings of Interspeech 2012, 2012 |
|
| Application of Subspace Gaussian Mixture Models in Contrastive Acoustic Scenarios, , , and , Idiap-RR-20-2012 |
|
| Automatic detection of conflict escalation in spoken conversations, , and , in: INTERSPEECH, ISCA, Portland, Oregon, USA., 2012 |
|
| Automatic detection of conflicts in spoken conversations: ratings and analysis of broadcast political debates, , and , in: Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing, Kyoto, Japan, 2012 |
|
| Automatic Speaker Role Labeling in AMI Meetings: Recognition of Formal and Social Roles, and , in: Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing, Kyoto, Japan, 2012, 2012 |
|
| Collecting data for socially intelligent surveillance and monitoring approaches: the case of conflict in competitive conversations, , , and , in: International Symposium on Communications, Control, and Signal Processing, 2012 |
|
| Detecting and Labeling Folk Literature in Spoken Cultural Heritage Archives using Structural and Prosodic Features, and , in: IEEE Content Based Multimedia Indexing, 2012 |
|
| DiarTk : An Open Source Toolkit for Research in Multistream Speaker Diarization and its Application to Meetings Recordings, and , in: Proceedings of Interspeech, 2012 |
|
| IMPROVING ACOUSTIC BASED KEYWORD SPOTTING USING LVCSR LATTICES, , and , Idiap-RR-36-2012 |
|
| IMPROVING ACOUSTIC BASED KEYWORD SPOTTING USING LVCSR LATTICES, , and , in: Proceedings on IEEE International Conference on Acoustics, Speech and Signal Processing, IEEE, Japan, pages 4413-4416, 2012 |
| Multistream speaker diarization of meetings recordings beyond MFCC and TDOA features, , and , in: Speech Communication, 54(1), 2012 |
[DOI] |
| Predicting the Conflict Level in Television Political Debates: an Approach Based on Crowdsourcing, Nonverbal Communication and Gaussian Processes, , , and , in: ACM Multimedia, 2012 |
| Speaker Diarization, and , in: Multimodal Signal Processing: Human Interactions in Meetings, Cambridge University Press, 2012 |
[URL] |
| Speaker Diarization of Meetings based on large TDOA feature vectors, and , in: Proceedings of International Conference on Acoustic, Speech and Signal Processing, 2012 |
|
| Speaker diarization of overlapping speech based on silence distribution in meeting recordings, and , in: INTERSPEECH, Portland, Oregon, USA, 2012 |
|
2011
| An Information Theoretic Combination of MFCC and TDOA Features for Speaker Diarization, , and , in: IEEE Transactions on Audio Speech and Language Processing, 19(2), 2011 |
[DOI] |
| Analysis and Comparison of Recent MLP Features for LVCSR Systems, , and , in: Proceedings of Interspeech 2011, 2011 |
|
| Current trends in multilingual speech processing, , , , , , , , and , in: Sadhana, 36(5):885–915, 2011 |
[DOI] [URL] |
| Data-driven extraction of spectral-dynamics based posteriors, , in: Handbook of Natural Language Processing and Machine Translation Handbook of Natural Language Processing and Machine Translation, Springer, 2011 |
[URL] |
| Information Bottleneck Features for HMM/GMM Speaker Diarization of Meetings Recordings, and , in: Interspeech, Florence, Italy, pages 953-956, 2011 |
|
| Language-Independent Socio-Emotional Role Recognition in the AMI Meetings Corpus, and , in: Proceedings of Interspeech, 2011 |
|
| MULTISTREAM SPEAKER DIARIZATION THROUGH INFORMATION BOTTLENECK SYSTEM OUTPUTS COMBINATION, , and , in: Proceedings of International Conference on Acoustics, Speech and Signal Processing, 2011 |
|
| Speaker Diarization of Meetings based on Speaker Role N-gram Models, , and , in: Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2011 |
|
| Transcribing Mandarin Broadcast Speech Using Multi-Layer Perceptron Acoustic Features, , , , and , in: IEEE Transactions on Audio, Speech, and Language Processing, 19(8), 2011 |
[DOI] |
| Understanding Social Signals in Multi-party Conversations: Automatic Recognition of Socio-Emotional Roles in the AMI Meeting Corpus, , , and , in: Proceedings of the IEEE International Conference on Systems, Man and Cybernetics, pages 374-379, 2011 |
2010
| A Comparative Study of MLP Front-ends for Mandarin ASR, , , , and , in: Proceedings of Interspeech, Japan, 2010 |
|
| Advances in Fast Multistream Diarization based on the Information Bottleneck Framework, , and , Idiap-RR-23-2010 |
|
| Advances in Fast Multistream Diarization based on the Information Bottleneck Framework, , and , in: Proceedings of Interspeech, 2010 |
| An Information Theoretic Combination of MFCC and TDOA Features for Speaker Diarization, , and , Idiap-RR-22-2010 |
|
| Application of Out-Of-Language Detection To Spoken-Term Detection, and , Idiap-RR-04-2010 |
|
| Application of Out-Of-Language Detection To Spoken-Term Detection, and , in: 2010 IEEE International Conference on Acoustics, Speech and Signal Processing, Dallas, USA, 2010 |
|
| English Spoken Term Detection in Multilingual Recordings, , and , Idiap-RR-21-2010 |
|
| English Spoken Term Detection in Multilingual Recordings, , and , in: Proceedings of Interspeech, Makuhari, Japan, 2010, ISCA, Makuhari, Japan, 2010 |
|
| Hierarchical and Parallel Processing of Auditory and Modulation Frequencies for Automatic Speech Recognition, , in: Speech Communication, 52(10):790-800, 2010 |
[DOI] |
| Improving Speech Processing trough Social Signals: Automatic Speaker Segmentation of Political Debates using Role based Turn-Taking Patterns., and , in: Proceedings of ACM Multimedia Workshop on Social Signal Processing, 2010 |
|
| KL Realignment for Speaker Diarization with Multiple Feature Streams, , and , Idiap-RR-24-2010 |
|
| Multi-Stream Speech Recognition based on Dempster-Shafer Combination Rule, , in: Speech Communication, 52(3):213-222, 2010 |
[DOI] |
| Multistream Speaker Diarization beyond Two Acoustic Feature Streams, , and , in: International Conference on Acoustics, Speech, and Signal Processing, 2010 |
|
| Social Signal Processing: Understanding Nonverbal Communication in Social Interactions, and , in: Proceedings of Measuring Behavior 2010, Eindhoven (The Netherlands), 2010 |
|
| VARIATIONAL BAYESIAN SPEAKER DIARIZATION OF MEETING RECORDINGS, , and , in: Proceedings of ICASSP, 2010 |
|
2009
| A Novel Criterion for Classifiers Combination in Multistream Speech Recognition, , in: IEEE Signal Processing Letters, 16(7), 2009 |
[DOI] |
| 1 | 2 |