Publications of AMIDA sorted by recency
Feature Mapping of Multiple Beamformed Sources for Robust Overlapping Speech Recognition Using a Microphone Array, , , , , , and , Idiap-RR-17-2014 |
|
Sampling techniques for audio-visual tracking and head pose estimation, and , in: Multimodal Signal Processing: Human Interactions in Meetings, pages 84-102, Cambridge University Press, 2012 |
|
The ICSI RT-09 Speaker Diarization System, , , , , , , and , in: IEEE Transactions on Audio, Speech, and Language Processing, 20(2):371-381, 2012 |
[DOI] |
Evaluation of Meeting Support Technology, and , in: Multimodal Signal Processing: Human Interactions in Meetings, pages 237-252, Cambridge University Press, 2012 |
User Requirements for Meeting Support Technology, and , in: Multimodal Signal Processing: Human Interactions in Meetings, pages 210-221, Cambridge University Press, 2012 |
Multimodal Signal Processing for Meetings: an Introduction, and , in: Multimodal Signal Processing: Human Interactions in Meetings, pages 1-11, Cambridge University Press, 2012 |
|
Speaker Diarization, and , in: Multimodal Signal Processing: Human Interactions in Meetings, Cambridge University Press, 2012 |
[URL] |
Transcribing meetings with the AMIDA systems, , , , , , , , , and , in: IEEE Transactions on Audio, Speech, and Language Processing, 20(2):486-498, 2012 |
[DOI] [URL] |
Estimating Cohesion in Small Groups Using Audio-Visual Nonverbal Behavior, and , in: IEEE Trans. on Multimedia, Special Issue on Multimodal Affective Interaction, 12(6):563-575, 2010 |
|
Multimodal Signal Processing: Human Interactions in Meetings, , , and , Cambridge University Press, 2012 |
[URL] |
A Just-in-Time Document Retrieval System for Dialogues or Monologues, , , and , in: SIGDIAL 2011 (12th annual SIGDIAL Meeting on Discourse and Dialogue), Demonstration Session, Portland, OR, pages 350-352, 2011 |
|
Finding Information in Multimedia Records of Meetings, , and , in: IEEE Multimedia, 19(2):48-57, 2012 |
[DOI] [URL] |
A Speech-based Just-in-Time Retrieval System using Semantic Search, , , and , in: Proceedings of the ACL-HLT 2011 System Demonstrations (49th Annual Meeting of the Association for Computational Linguistics), Portland, OR, pages 80-86, 2011 |
[URL] |
A Speech-based Just-in-Time Retrieval System using Semantic Search, , , and , Idiap-RR-31-2011 |
|
When Users Meet Technology: The Meeting Browser Development Helix, , and , Idiap-RR-05-2011 |
|
Automatic Content Linking: Speech-based Just-in-time Retrieval for Multimedia Archives, , , , , and , in: Proceedings of the 33rd Annual ACM SIGIR Conference, Geneva, Switzerland, pages 703, 2010 |
[DOI] |
Finding Information in Multimedia Records of Meetings, , and , Idiap-RR-32-2011 |
|
View-Based Appearance Model Online Learning for 3D Deformable Face Tracking, and , in: Proc. Int. Conf. on Computer Vision Theory and Applications, Angers, 2010 |
|
An Information Theoretic Approach to Speaker Diarization of Meeting Recordings, , Ecole polytechnique fédérale de Lausanne, 2010 |
|
An Information Theoretic Combination of MFCC and TDOA Features for Speaker Diarization, , and , in: IEEE Transactions on Audio Speech and Language Processing, 19(2), 2011 |
[DOI] |
The Wolf Corpus: Exploring group behaviour in a competitive role-playing game, and , in: ACM Multimedia, 2010 |
|
Estimating Dominance in Multi-Party Meetings Using Speaker Diarization, , , and , in: IEEE Transactions on Audio, Speech, and Language Processing, 19(4):847-860, 2011 |
|
Predicting Remote Versus Collocated Group Interactions using Nonverbal Cues, , and , in: Proc. Int. Conf. on Multimodal Interfaces, Workshop on Multimodal Sensor-Based Systems and Mobile Phones for Social Computing, Cambridge, 2009 |
[DOI] |
The ACLD: Speech-based Just-in-Time Retrieval of Meeting Transcripts, Documents and Websites, , , and , in: ACM Multimedia Workshop on Searching Spontaneous Conversational Speech, Florence, Italy, 2010 |
|
The ACLD: Speech-based Just-in-Time Retrieval of Multimedia Documents and Websites, , , and , Idiap-RR-26-2010 |
|
The AMIDA 2009 Meeting Transcription System, , , , , , , , and , in: Proceedings of Interspeech, Makuhari, Japan, 2010 |
|
Tracter: A Lightweight Dataflow Framework, and , in: Proceedings of Interspeech, Makuhari, Japan, 2010 |
|
Audio–Visual Synchronisation for Speaker Diarisation, , and , in: International Conference on Speech and Language Processing, Interspeech, Makuhari, Japan, 2010 |
|
Advances in Fast Multistream Diarization based on the Information Bottleneck Framework, , and , Idiap-RR-23-2010 |
|
Estimating Cohesion in Small Groups using Audio-Visual Nonverbal Behavior, and , Idiap-RR-12-2010 |
|
Tracter: A Lightweight Dataflow Framework, and , Idiap-RR-10-2010 |
|
A Multimodal Corpus for Studying Dominance in Small Group Conversations, , and , in: LREC workshop on Multimodal Corpora: Advances in Capturing, Coding and Analyzing Multimodality, Malta, May 2010 |
|
Multi-Person Visual Focus of Attention from Head Pose and Meeting Contextual Cues, and , in: IEEE Trans. on Pattern Analysis and Machine Intelligence, 33(1):101-116, 2011 |
|
Learning Large Margin Likelihood for Realtime Head Pose Tracking, and , in: IEEE Int. Conference on Image Processing, Cairo, Egypt, IEEE, 2009 |
|
Structure and appearance features for robust 3D facial actions tracking, and , in: IEEE Proc. Int. Conf. on Multimedia and Expo, IEEE, 2009 |
|
Finding without searching, , Idiap-Com-01-2010 |
|
Application of Out-Of-Language Detection To Spoken-Term Detection, and , in: 2010 IEEE International Conference on Acoustics, Speech and Signal Processing, Dallas, USA, 2010 |
|
Multistream Speaker Diarization beyond Two Acoustic Feature Streams, , and , in: International Conference on Acoustics, Speech, and Signal Processing, 2010 |
|
AMIDA/Klewel Mini-Project, , , and , Idiap-RR-03-2010 |
|
Using Audio and Visual Cues for Speaker Diarisation Initialisation, and , in: International Conference on Acoustics, Speech and Signal Processing, 2010 |
|
An Adaptive Initialization Method for Speaker Diarization based on Prosodic Features, and , in: Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing, Dallas, USA, pages 4946-4949, 2010 |
|
A Multimedia Retrieval System Using Speech Input, , , , , , , , , , and , in: Proceedings of ICMI-MLMI 2009 (11th International Conference on Multimodal Interfaces and 6th Workshop on Machine Learning for Multimodal Interaction), Cambridge, MA, 2009 |
|
Managing Multimodal Data, Metadata and Annotations: Challenges and Solutions, , in: Multimodal Signal Processing for Human-Computer Interaction, Elsevier / Academic Press, 2009 |
Accessing a Large Multimodal Corpus using an Automatic Content Linking Device, , , and , in: Multimodal Corpora: From Models of Natural Interaction to Systems and Applications, Springer-Verlag, 2009 |
[DOI] |
User Interface Design in a Just-in-time Retrieval System for Meetings, , , , , , and , Idiap-RR-38-2009 |
|
Applications of Signal Analysis Using Autoregressive Models for Amplitude Modulation, , , and , in: IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA '09), IEEE, Mohonk Mountain House, New Paltz, New York, USA, 2009 |
[URL] |
Applications of Signal Analysis Using Autoregressive Models for Amplitude Modulation, , , and , Idiap-RR-35-2009 |
|
Analysis of F0 and Cepstral Features for Robust Automatic Gender Recognition, and , Idiap-RR-30-2009 |
|
An Adaptive Initialization Method for Speaker Diarization based on Prosodic Features, and , Idiap-RR-02-2010 |
|
Application of Out-Of-Language Detection To Spoken-Term Detection, and , Idiap-RR-04-2010 |
|
Automatic Out-of-Language Detection Based on Confidence Measures Derived from LVCSR Word and Phone Lattices, , in: 10th Annual Conference of the International Speech Communication Association, ISCA, Brighton, England, 2009 |
|
Robust Speaker Diarization for Short Speech Recordings, and , in: Proceedings of the IEEE workshop on Automatic Speech Recognition and Understanding, Merano, Italy, pages 432-437, 2009 |
|
Robust Speaker Diarization for Short Speech Recordings, and , Idiap-RR-26-2009 |
|
Discovering Group Nonverbal Conversational Patterns with Topics, and , in: Proceedings ICMI-MLMI, 2009 |
|
Investigating the use of Visual Focus of Attention for Audio-Visual Speaker Diarisation, , , and , in: Proceedings of the ACM International Conference on Multimedia, Beijing, China, 2009 |
|
Visual Speaker Localization Aided by Acoustic Models, , and , in: ACM Multimedia, 2009 |
KL Realignment for Speaker Diarization with Multiple Feature Streams, , and , in: 10th Annual Conference of the International Speech Communication Association, 2009 |
Mutual Information based Channel Selection for Speaker Diarization of Meetings Data, , and , in: Proceedings of International conference on acoustics speech and signal processing, 2009 |
Real-Time ASR from Meetings, , , , , , , , and , in: Proceedings of Interspeech, Brighton, UK, 2009 |
|
Automatic vs. human question answering over multimedia meeting recordings, and , in: 10th Annual Conference of the International Speech Communication Association, 2009 |
|
Automatic vs. human question answering over multimedia meeting recordings, and , Idiap-RR-13-2009 |
|
Investigating Privacy-Sensitive Features for Speech Detection in Multiparty Conversations, , , and , in: Proceedings of Interspeech 2009, 2009 |
|
Comparing meeting browsers using a task-based evaluation method, , Idiap-RR-11-2009 |
|
KL Realignment for Speaker Diarization with Multiple Feature Streams, , and , Idiap-RR-24-2010 |
|
Novel initialization methods for Speaker Diarization, , Idiap-RR-07-2009 |
|
Characterising Conversational Group Dynamics Using Nonverbal Behaviour, , and , in: Proceedings ICME 2009, 2009 |
|
Real-Time ASR from Meetings, , , , , , , , and , Idiap-RR-15-2009 |
|
Visual Activity Context For Focus of Attention Estimation in Dynamic Meetings, , and , in: International Conference on Multimedia & Expo, 2009 |
|
Posterior features applied to speech recognition tasks with user-defined vocabulary, , and , in: Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2009 |
|
Non-linear mapping for multi-channel speech separation and robust overlapping speech recognition, , , and , in: Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2009 |
|
Visual activity context for focus of attention estimation in dynamic meetings, , and , Idiap-RR-02-2009 |
|
Mutual Information based Channel Selection for Speaker Diarization of Meetings Data, , and , in: Proceedings of International Conference on Acoustics, Speech and Signal Processing, 2009 |
|
An Information Theoretic Approach to Speaker Diarization of Meeting Data, , and , in: IEEE Transactions on Audio Speech and Language Processing, 17(7), 2009 |
[DOI] |
Beamforming with a Maximum Negentropy Criterion, , , , , and , in: IEEE Transactions on Audio Speech and Language Processing, 17(5), 2009 |
|
Adaptive Beamforming with a Maximum Negentropy Criterion, , , , and , in: Proceedings of the Joint Workshop on Hands-free Speech Communication and Microphone Arrays, Italy, 2008 |
|
Filter Bank Design based on Minimization of Individual Aliasing Terms for Minimum Mutual Information Subband Adaptive Beamforming, , , , , and , in: Proceedings of ICASSP 2008, Las Vegas, USA, 2008 |
|
Recognizing Human Visual Focus of Attention from Head Pose in Meetings, and , in: IEEE Transactions on Systems, Man, Cybernetics, Part-B, 39(1), 2009 |
|
Maximum kurtosis beamforming with the generalized sidelobe canceller, , , , , and , in: Proceedings of INTERSPEECH, September 2008, Brisbane, Australia, 2008 |
|
Automatic Out-of-Language Detection based on Confidence Measures derived from LVCSR Word and Phone Lattices, , Idiap-RR-06-2009 |
|
Modeling Dominance in Group Conversations using NonVerbal Activity Cues, , , and , in: IEEE Transactions on Audio, Speech and Language Processing, 2008 |
|
Phoneme Recognition Using Spectral Envelope and Modulation Frequency Features, , and , Idiap-RR-04-2009 |
|
Integrating audio and vision for robust automatic gender recognition, and , Idiap-RR-73-2008 |
|
Modulation Frequency Features For Phoneme Recognition In Noisy Speech, , and , in: Journal of Acoustical Society of America - Express Letters, 2008 |
|
Modulation Frequency Features For Phoneme Recognition In Noisy Speech, , and , Idiap-RR-70-2008 |
|
Estimating the Dominant Person in Multi-Party Conversations Using Speaker Diarization Strategies, , , and , in: IEEE International Conference on Acoustics, Speech, and Signal Processing, 2007 |
Investigating Automatic Dominance Estimation in Groups From Visual Attention and Speaking Activity, , , , and , in: International Conference on Multi-modal Interfaces, 2008 |
|
Towards Audio-Visual On-line Diarization Of Participants In Group Meetings, and , in: European Conference on Computer Vision Workshop on Multi-camera and Multi-modal Sensor Fusion, 2008 |
|
Graphical representation of meetings on mobile devices, , and , in: MobileHCI 2008 (10th International Conference on Human-Computer Interaction with Mobile Devices and Services, Demonstrations Session), Amsterdam, 2008 |
|
Task-based evaluation of meeting browsers: from BET task elicitation to user behavior analysis, , , and , in: 6th International Conference on Language Resources and Evaluation, Marrakech, Morocco, 2008 |
|
The AMIDA Automatic Content Linking Device: Just-in-Time Document Retrieval in Meetings, , , , , , , and , in: Machine Learning for Multimodal Interaction V, Utrecht, Springer-Verlag, 2008 |
[DOI] |
Towards an Objective Test for Meeting Browsers: the BET4TQB Pilot Experiment, , , and , in: Machine Learning for Multimodal Interaction IV, Springer-Verlag, 2008 |
[DOI] |
Machine Learning for Multimodal Interaction V, and , Springer-Verlag, LNCS, volume 5237, 2008 |
[DOI] |
Machine Learning for Multimodal Interaction IV, , and , Springer-Verlag, LNCS, volume 4892, 2008 |
[DOI] |
Social Signal Processing: State-of-the-Art and Future Perspectives of an Emerging Domain, , , and , in: Proceedings of the ACM International Conference on Multimedia, 2008 |
|
Social Signals, their Function, and Automatic Analysis: A Survey, , , and , in: Proceedings of International Conference on Multimodal Interfaces (to appear), 2008 |
|
Identifying Dominant People in Meetings from Audio-Visual Sensors, and , in: International Conference on Automatic Face and Gesture Recognition, Amsterdam, The Netherlands, 2008 |
|
Identifying Dominant People in Meetings from Audio-Visual Sensors, and , Idiap-RR-65-2008 |
|