Idiap Research Institute
Project AMIDA
Name: AMIDA

Publications of project AMIDA

2014
2012
Evaluation of Meeting Support Technology, Simon Tucker and Andrei Popescu-Belis, in: Multimodal Signal Processing: Human Interactions in Meetings, pages 237-252, Cambridge University Press, 2012
Multimodal Signal Processing for Meetings: an Introduction, Andrei Popescu-Belis and Jean Carletta, in: Multimodal Signal Processing: Human Interactions in Meetings, pages 1-11, Cambridge University Press, 2012
attachment
Sampling techniques for audio-visual tracking and head pose estimation, Jean-Marc Odobez and Oswald Lanz, in: Multimodal Signal Processing: Human Interactions in Meetings, pages 84-102, Cambridge University Press, 2012
attachment
Speaker Diarization, Fabio Valente and Gerald Friedland, in: Multimodal Signal Processing: Human Interactions in Meetings, Cambridge University Press, 2012
[URL]
The ICSI RT-09 Speaker Diarization System, Gerald Friedland, Adam Janin, David Imseng, Xavier Anguera, Luke Gottlieb, Marijn Huijbregts, Mary Tai Knox and Oriol Vinyals, in: IEEE Transactions on Audio, Speech, and Language Processing, 20(2):371-381, 2012
[DOI]
User Requirements for Meeting Support Technology, Denis Lalanne and Andrei Popescu-Belis, in: Multimodal Signal Processing: Human Interactions in Meetings, pages 210-221, Cambridge University Press, 2012
2011
A Just-in-Time Document Retrieval System for Dialogues or Monologues, Andrei Popescu-Belis, Majid Yazdani, Alexandre Nanchen and Philip N. Garner, in: SIGDIAL 2011 (12th annual SIGDIAL Meeting on Discourse and Dialogue), Demonstration Session, Portland, OR, pages 350-352, 2011
attachment
A Speech-based Just-in-Time Retrieval System using Semantic Search, Andrei Popescu-Belis, Majid Yazdani, Alexandre Nanchen and Philip N. Garner, in: Proceedings of the ACL-HLT 2011 System Demonstrations (49th Annual Meeting of the Association for Computational Linguistics), Portland, OR, pages 80-86, 2011
[URL]
An Information Theoretic Combination of MFCC and TDOA Features for Speaker Diarization, Deepu Vijayasenan, Fabio Valente and Hervé Bourlard, in: IEEE Transactions on Audio, Speech, and Language Processing, 19(2), 2011
[DOI]
Estimating Dominance in Multi-Party Meetings Using Speaker Diarization, Hayley Hung, Yan Huang, Gerald Friedland and Daniel Gatica-Perez, in: IEEE Transactions on Audio, Speech, and Language Processing, 19(4):847-860, 2011
attachment
Multi-Person Visual Focus of Attention from Head Pose and Meeting Contextual Cues, Silèye O. Ba and Jean-Marc Odobez, in: IEEE Trans. on Pattern Analysis and Machine Intelligence, 33(1):101-116, 2011
attachment
2010
A Multimodal Corpus for Studying Dominance in Small Group Conversations, Oya Aran, Hayley Hung and Daniel Gatica-Perez, in: LREC workshop on Multimodal Corpora: Advances in Capturing, Coding and Analyzing Multimodality, Malta, May 2010, 2010
attachment
An Adaptive Initialization Method for Speaker Diarization based on Prosodic Features, David Imseng and Gerald Friedland, in: Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing, Dallas, USA, pages 4946-4949, 2010
attachment
Application of Out-Of-Language Detection To Spoken-Term Detection, Petr Motlicek and Fabio Valente, in: 2010 IEEE International Conference on Acoustics, Speech and Signal Processing, Dallas, USA, 2010
attachment
Audio–Visual Synchronisation for Speaker Diarisation, Giulia Garau, Alfred Dielmann and Hervé Bourlard, in: International Conference on Speech and Language Processing, Interspeech, Makuhari, Japan, 2010
attachment
Estimating Cohesion in Small Groups Using Audio-Visual Nonverbal Behavior, Hayley Hung and Daniel Gatica-Perez, in: IEEE Trans. on Multimedia, Special Issue on Multimodal Affective Interaction, 12(6):563-575, 2010
attachment
Multistream Speaker Diarization beyond Two Acoustic Feature Streams, Deepu Vijayasenan, Fabio Valente and Hervé Bourlard, in: International Conference on Acoustics, Speech, and Signal Processing, 2010
attachment
The ACLD: Speech-based Just-in-Time Retrieval of Meeting Transcripts, Documents and Websites, Andrei Popescu-Belis, Jonathan Kilgour, Alexandre Nanchen and Peter Poller, in: ACM Multimedia Workshop on Searching Spontaneous Conversational Speech, Florence, Italy, 2010
attachment
Tracter: A Lightweight Dataflow Framework, Philip N. Garner and John Dines, in: Proceedings of Interspeech, Makuhari, Japan, 2010
attachment
Using Audio and Visual Cues for Speaker Diarisation Initialisation, Giulia Garau and Hervé Bourlard, in: International Conference on Acoustics, Speech and Signal Processing, 2010
attachment
View-Based Appearance Model Online Learning for 3D Deformable Face Tracking, Stéphanie Lefèvre and Jean-Marc Odobez, in: Proc. Int. Conf. on Computer Vision Theory and Applications, Angers, 2010
attachment
2009
A Multimedia Retrieval System Using Speech Input, Andrei Popescu-Belis, Peter Poller, Jonathan Kilgour, Erik Boertjes, Jean Carletta, Sandro Castronovo, Michal Fapso, Alexandre Nanchen, Theresa Wilson, Joost de Wit and Majid Yazdani, in: Proceedings of ICMI-MLMI 2009 (11th International Conference on Multimodal Interfaces and 6th Workshop on Machine Learning for Multimodal Interaction), Cambridge, MA, 2009
attachment
Accessing a Large Multimodal Corpus using an Automatic Content Linking Device, Andrei Popescu-Belis, Jean Carletta, Jonathan Kilgour and Peter Poller, in: Multimodal Corpora: From Models of Natural Interaction to Systems and Applications, Springer-Verlag, 2009
attachment
[DOI]
An Information Theoretic Approach to Speaker Diarization of Meeting Data, Deepu Vijayasenan, Fabio Valente and Hervé Bourlard, in: IEEE Transactions on Audio, Speech, and Language Processing, 17(7), 2009
attachment
[DOI]
Applications of Signal Analysis Using Autoregressive Models for Amplitude Modulation, Sriram Ganapathy, Samuel Thomas, Petr Motlicek and Hynek Hermansky, in: IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA '09), IEEE, Mohonk Mountain House, New Paltz, New York, USA, 2009
attachment
[URL]
Automatic Out-of-Language Detection Based on Confidence Measures Derived from LVCSR Word and Phone Lattices, Petr Motlicek, in: 10th Annual Conference of the International Speech Communication Association, ISCA, Brighton, England, 2009
attachment
Automatic vs. human question answering over multimedia meeting recordings, Quoc Anh Le and Andrei Popescu-Belis, in: 10th Annual Conference of the International Speech Communication Association, 2009
attachment
Beamforming with a Maximum Negentropy Criterion, Kenichi Kumatani, John McDonough, Barbara Rauch, Dietrich Klakow, Philip N. Garner and Weifeng Li, in: IEEE Transactions on Audio, Speech, and Language Processing, 17(5), 2009
attachment
Investigating the use of Visual Focus of Attention for Audio-Visual Speaker Diarisation, Giulia Garau, Silèye O. Ba, Hervé Bourlard and Jean-Marc Odobez, in: Proceedings of the ACM International Conference on Multimedia, Beijing, China, 2009
attachment
KL Realignment for Speaker Diarization with Multiple Feature Streams, Deepu Vijayasenan, Fabio Valente and Hervé Bourlard, in: 10th Annual Conference of the International Speech Communication Association, 2009
Learning Large Margin Likelihood for Realtime Head Pose Tracking, Elisa Ricci and Jean-Marc Odobez, in: IEEE Int. Conference on Image Processing, Cairo, Egypt, IEEE, 2009
attachment
Managing Multimodal Data, Metadata and Annotations: Challenges and Solutions, Andrei Popescu-Belis, in: Multimodal Signal Processing for Human-Computer Interaction, Elsevier / Academic Press, 2009
Mutual Information Based Channel Selection for Speaker Diarization of Meetings Data, Deepu Vijayasenan, Fabio Valente and Hervé Bourlard, in: Proceedings of International Conference on Acoustics, Speech and Signal Processing, 2009
attachment
Non-linear mapping for multi-channel speech separation and robust overlapping speech recognition, Weifeng Li, John Dines, Mathew Magimai.-Doss and Hervé Bourlard, in: Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2009
attachment
Posterior features applied to speech recognition tasks with user-defined vocabulary, Guillermo Aradilla, Hervé Bourlard and Mathew Magimai.-Doss, in: Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2009
attachment
Predicting Remote Versus Collocated Group Interactions using Nonverbal Cues, Dairazalia Sanchez-Cortes, Dinesh Babu Jayagopi and Daniel Gatica-Perez, in: Proc. Int. Conf. on Multimodal Interfaces, Workshop on Multimodal Sensor-Based Systems and Mobile Phones for Social Computing, Cambridge, 2009
[DOI]
Recognizing Human Visual Focus of Attention from Head Pose in Meetings, Silèye O. Ba and Jean-Marc Odobez, in: IEEE Transactions on Systems, Man, and Cybernetics, Part B, 39(1), 2009
attachment
Robust Speaker Diarization for Short Speech Recordings, David Imseng and Gerald Friedland, in: Proceedings of the IEEE workshop on Automatic Speech Recognition and Understanding, Merano, Italy, pages 432-437, 2009
attachment
Structure and appearance features for robust 3D facial actions tracking, Stéphanie Lefèvre and Jean-Marc Odobez, in: IEEE Proc. Int. Conf. on Multimedia and Expo, IEEE, 2009
attachment
2008
Adaptive Beamforming with a Maximum Negentropy Criterion, Kenichi Kumatani, John McDonough, Dietrich Klakow, Philip N. Garner and Weifeng Li, in: Proceedings of the Joint Workshop on Hands-free Speech Communication and Microphone Arrays, Italy, 2008
attachment
Graphical representation of meetings on mobile devices, Lukas Matena, Alejandro Jaimes and Andrei Popescu-Belis, in: MobileHCI 2008 (10th International Conference on Human-Computer Interaction with Mobile Devices and Services, Demonstrations Session), Amsterdam, 2008
attachment
Identifying Dominant People in Meetings from Audio-Visual Sensors, Hayley Hung and Daniel Gatica-Perez, in: International Conference on Automatic Face and Gesture Recognition, Amsterdam, The Netherlands, 2008
attachment
Modulation Frequency Features For Phoneme Recognition In Noisy Speech, Sriram Ganapathy, Samuel Thomas and Hynek Hermansky, in: Journal of Acoustical Society of America - Express Letters, 2008
attachment
Social Signals, their Function, and Automatic Analysis: A Survey, Alessandro Vinciarelli, Maja Pantic, Hervé Bourlard and Alex Pentland, in: Proceedings of International Conference on Multimodal Interfaces (to appear), 2008
attachment
Task-based evaluation of meeting browsers: from BET task elicitation to user behavior analysis, Andrei Popescu-Belis, Mike Flynn, Pierre Wellner and Philippe Baudrion, in: 6th International Conference on Language Resources and Evaluation, Marrakech, Morocco, 2008
attachment
Towards Audio-Visual On-line Diarization Of Participants In Group Meetings, Hayley Hung and Gerald Friedland, in: European Conference on Computer Vision Workshop on Multi-camera and Multi-modal Sensor Fusion, 2008
attachment
2007