Idiap Research Institute
Project AMIDA
Name: AMIDA

Publications of AMIDA sorted by journal and type


Publications of type Idiap-RR


2014


2011


2010


2009


2008


Publications of type Idiap-Com


2010


IEEE Multimedia


IEEE Transactions on Audio, Speech, and Language Processing

The ICSI RT-09 Speaker Diarization System, Gerald Friedland, Adam Janin, David Imseng, Xavier Anguera, Luke Gottlieb, Marijn Huijbregts, Mary Tai Knox and Oriol Vinyals, in: IEEE Transactions on Audio, Speech, and Language Processing, 20(2):371-381, 2012

IEEE Trans. on Pattern Analysis and Machine Intelligence

Multi-Person Visual Focus of Attention from Head Pose and Meeting Contextual Cues, Silèye O. Ba and Jean-Marc Odobez, in: IEEE Trans. on Pattern Analysis and Machine Intelligence, 33(1):101-116, 2011

IEEE Transactions on Audio Speech and Language Processing

An Information Theoretic Combination of MFCC and TDOA Features for Speaker Diarization, Deepu Vijayasenan, Fabio Valente and Hervé Bourlard, in: IEEE Transactions on Audio Speech and Language Processing, 19(2), 2011

IEEE Transactions on Audio, Speech, and Language Processing

Estimating Dominance in Multi-Party Meetings Using Speaker Diarization, Hayley Hung, Yan Huang, Gerald Friedland and Daniel Gatica-Perez, in: IEEE Transactions on Audio, Speech, and Language Processing, 19(4):847-860, 2011

IEEE Trans. on Multimedia, Special Issue on Multimodal Affective Interaction

Estimating Cohesion in Small Groups Using Audio-Visual Nonverbal Behavior, Hayley Hung and Daniel Gatica-Perez, in: IEEE Trans. on Multimedia, Special Issue on Multimodal Affective Interaction, 12(6):563-575, 2010

IEEE Transactions on Audio Speech and Language Processing

An Information Theoretic Approach to Speaker Diarization of Meeting Data, Deepu Vijayasenan, Fabio Valente and Hervé Bourlard, in: IEEE Transactions on Audio Speech and Language Processing, 17(7), 2009
Beamforming with a Maximum Negentropy Criterion, Kenichi Kumatani, John McDonough, Barbara Rauch, Dietrich Klakow, Philip N. Garner and Weifeng Li, in: IEEE Transactions on Audio Speech and Language Processing, 17(5), 2009

IEEE Transactions on Systems, Man, Cybernetics, Part-B

Recognizing Human Visual Focus of Attention from Head Pose in Meetings, Silèye O. Ba and Jean-Marc Odobez, in: IEEE Transactions on Systems, Man, Cybernetics, Part-B, 39(1), 2009

IEEE Transactions on Audio, Speech and Language Processing


Journal of Acoustical Society of America - Express Letters

Modulation Frequency Features For Phoneme Recognition In Noisy Speech, Sriram Ganapathy, Samuel Thomas and Hynek Hermansky, in: Journal of Acoustical Society of America - Express Letters, 2008

Publications of type Book


2012


2008


Multimodal Signal Processing: Human Interactions in Meetings (2012)

Evaluation of Meeting Support Technology, Simon Tucker and Andrei Popescu-Belis, in: Multimodal Signal Processing: Human Interactions in Meetings, pages 237-252, Cambridge University Press, 2012
Multimodal Signal Processing for Meetings: an Introduction, Andrei Popescu-Belis and Jean Carletta, in: Multimodal Signal Processing: Human Interactions in Meetings, pages 1-11, Cambridge University Press, 2012
Sampling techniques for audio-visual tracking and head pose estimation, Jean-Marc Odobez and Oswald Lanz, in: Multimodal Signal Processing: Human Interactions in Meetings, pages 84-102, Cambridge University Press, 2012
Speaker Diarization, Fabio Valente and Gerald Friedland, in: Multimodal Signal Processing: Human Interactions in Meetings, Cambridge University Press, 2012
User Requirements for Meeting Support Technology, Denis Lalanne and Andrei Popescu-Belis, in: Multimodal Signal Processing: Human Interactions in Meetings, pages 210-221, Cambridge University Press, 2012

Multimodal Corpora: From Models of Natural Interaction to Systems and Applications (2009)

Accessing a Large Multimodal Corpus using an Automatic Content Linking Device, Andrei Popescu-Belis, Jean Carletta, Jonathan Kilgour and Peter Poller, in: Multimodal Corpora: From Models of Natural Interaction to Systems and Applications, Springer-Verlag, 2009

Multimodal Signal Processing for Human-Computer Interaction (2009)

Managing Multimodal Data, Metadata and Annotations: Challenges and Solutions, Andrei Popescu-Belis, in: Multimodal Signal Processing for Human-Computer Interaction, Elsevier / Academic Press, 2009

Machine Learning for Multimodal Interaction IV (2008)


SIGDIAL 2011 (12th annual SIGDIAL Meeting on Discourse and Dialogue), Demonstration Session (2011)

A Just-in-Time Document Retrieval System for Dialogues or Monologues, Andrei Popescu-Belis, Majid Yazdani, Alexandre Nanchen and Philip N. Garner, in: SIGDIAL 2011 (12th annual SIGDIAL Meeting on Discourse and Dialogue), Demonstration Session, Portland, OR, pages 350-352, 2011

Proceedings of the ACL-HLT 2011 System Demonstrations (49th Annual Meeting of the Association for Computational Linguistics) (2011)

A Speech-based Just-in-Time Retrieval System using Semantic Search, Andrei Popescu-Belis, Majid Yazdani, Alexandre Nanchen and Philip N. Garner, in: Proceedings of the ACL-HLT 2011 System Demonstrations (49th Annual Meeting of the Association for Computational Linguistics), Portland, OR, pages 80-86, 2011

LREC workshop on Multimodal Corpora: Advances in Capturing, Coding and Analyzing Multimodality, Malta, May 2010 (2010)

A Multimodal Corpus for Studying Dominance in Small Group Conversations, Oya Aran, Hayley Hung and Daniel Gatica-Perez, in: LREC workshop on Multimodal Corpora: Advances in Capturing, Coding and Analyzing Multimodality, Malta, May 2010, 2010

Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing (2010)

An Adaptive Initialization Method for Speaker Diarization based on Prosodic Features, David Imseng and Gerald Friedland, in: Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing, Dallas, USA, pages 4946-4949, 2010

2010 IEEE International Conference on Acoustics, Speech and Signal Processing (2010)

Application of Out-Of-Language Detection To Spoken-Term Detection, Petr Motlicek and Fabio Valente, in: 2010 IEEE International Conference on Acoustics, Speech and Signal Processing, Dallas, USA, 2010

Proceedings of the 33rd Annual ACM SIGIR Conference (2010)


International Conference on Acoustics, Speech, and Signal Processing (2010)

Multistream Speaker Diarization beyond Two Acoustic Feature Streams, Deepu Vijayasenan, Fabio Valente and Hervé Bourlard, in: International Conference on Acoustics, Speech, and Signal Processing, 2010

ACM Multimedia Workshop on Searching Spontaneous Conversational Speech (2010)

The ACLD: Speech-based Just-in-Time Retrieval of Meeting Transcripts, Documents and Websites, Andrei Popescu-Belis, Jonathan Kilgour, Alexandre Nanchen and Peter Poller, in: ACM Multimedia Workshop on Searching Spontaneous Conversational Speech, Florence, Italy, 2010

Proceedings of Interspeech (2010)


ACM Multimedia (2010)


Proceedings of Interspeech (2010)

Tracter: A Lightweight Dataflow Framework, Philip N. Garner and John Dines, in: Proceedings of Interspeech, Makuhari, Japan, 2010

International Conference on Acoustics, Speech and Signal Processing (2010)

Using Audio and Visual Cues for Speaker Diarisation Initialisation, Giulia Garau and Hervé Bourlard, in: International Conference on Acoustics, Speech and Signal Processing, 2010

Proc. Int. Conf. on Computer Vision Theory and Applications (2010)

View-Based Appearance Model Online Learning for 3D Deformable Face Tracking, Stéphanie Lefèvre and Jean-Marc Odobez, in: Proc. Int. Conf. on Computer Vision Theory and Applications, Angers, 2010

International Conference on Speech and Language Processing, Interspeech (2010)

Audio–Visual Synchronisation for Speaker Diarisation, Giulia Garau, Alfred Dielmann and Hervé Bourlard, in: International Conference on Speech and Language Processing, Interspeech, Makuhari, Japan, 2010

Proceedings of ICMI-MLMI 2009 (11th International Conference on Multimodal Interfaces and 6th Workshop on Machine Learning for Multimodal Interaction) (2009)

A Multimedia Retrieval System Using Speech Input, Andrei Popescu-Belis, Peter Poller, Jonathan Kilgour, Erik Boertjes, Jean Carletta, Sandro Castronovo, Michal Fapso, Alexandre Nanchen, Theresa Wilson, Joost de Wit and Majid Yazdani, in: Proceedings of ICMI-MLMI 2009 (11th International Conference on Multimodal Interfaces and 6th Workshop on Machine Learning for Multimodal Interaction), Cambridge, MA, 2009

IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2009, WASPAA '09. (2009)

Applications of Signal Analysis Using Autoregressive Models for Amplitude Modulation, Sriram Ganapathy, Samuel Thomas, Petr Motlicek and Hynek Hermansky, in: IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2009, WASPAA '09, IEEE, Mohonk Mountain House, New Paltz, New York, USA, 2009

10th Annual Conference of the International Speech Communication Association (2009)

Automatic vs. human question answering over multimedia meeting recordings, Quoc Anh Le and Andrei Popescu-Belis, in: 10th Annual Conference of the International Speech Communication Association, 2009

Proceedings ICME 2009 (2009)


Proceedings ICMI-MLMI (2009)


Proceedings of Interspeech 2009 (2009)


Proceedings of the ACM International Conference on Multimedia (2009)

Investigating the use of Visual Focus of Attention for Audio-Visual Speaker Diarisation, Giulia Garau, Silèye O. Ba, Hervé Bourlard and Jean-Marc Odobez, in: Proceedings of the ACM International Conference on Multimedia, Beijing, China, 2009

10th Annual Conference of the International Speech Communication Association (2009)

KL Realignment for Speaker Diarization with Multiple Feature Streams, Deepu Vijayasenan, Fabio Valente and Hervé Bourlard, in: 10th Annual Conference of the International Speech Communication Association, 2009

IEEE Int. Conference on Image Processing, Cairo, Egypt (2009)

Learning Large Margin Likelihood for Realtime Head Pose Tracking, Elisa Ricci and Jean-Marc Odobez, in: IEEE Int. Conference on Image Processing, Cairo, Egypt, IEEE, 2009

Proceedings of International Conference on Acoustics, Speech and Signal Processing (2009)

Mutual Information based Channel Selection for Speaker Diarization of Meetings Data, Deepu Vijayasenan, Fabio Valente and Hervé Bourlard, in: Proceedings of International Conference on Acoustics, Speech and Signal Processing, 2009

Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) (2009)

Non-linear mapping for multi-channel speech separation and robust overlapping speech recognition, Weifeng Li, John Dines, Mathew Magimai.-Doss and Hervé Bourlard, in: Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2009
Posterior features applied to speech recognition tasks with user-defined vocabulary, Guillermo Aradilla, Hervé Bourlard and Mathew Magimai.-Doss, in: Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2009

Proc. Int. Conf. on Multimodal Interfaces, Workshop on Multimodal Sensor-Based Systems and Mobile Phones for Social Computing (2009)

Predicting Remote Versus Collocated Group Interactions using Nonverbal Cues, Dairazalia Sanchez-Cortes, Dinesh Babu Jayagopi and Daniel Gatica-Perez, in: Proc. Int. Conf. on Multimodal Interfaces, Workshop on Multimodal Sensor-Based Systems and Mobile Phones for Social Computing, Cambridge, 2009

Proceedings of Interspeech (2009)


Proceedings of the IEEE workshop on Automatic Speech Recognition and Understanding (2009)

Robust Speaker Diarization for Short Speech Recordings, David Imseng and Gerald Friedland, in: Proceedings of the IEEE workshop on Automatic Speech Recognition and Understanding, Merano, Italy, pages 432-437, 2009

IEEE Proc. Int. Conf. on Multimedia and Expo (2009)

Structure and appearance features for robust 3D facial actions tracking, Stéphanie Lefèvre and Jean-Marc Odobez, in: IEEE Proc. Int. Conf. on Multimedia and Expo, IEEE, 2009

International Conference on Multimedia & Expo (2009)


ACM Multimedia (2009)


10th Annual Conference of the International Speech Communication Association (2009)

Automatic Out-of-Language Detection Based on Confidence Measures Derived from LVCSR Word and Phone Lattices, Petr Motlicek, in: 10th Annual Conference of the International Speech Communication Association, ISCA, Brighton, England, 2009

Proceedings of the Joint Workshop on Hands-free Speech Communication and Microphone Arrays (2008)

Adaptive Beamforming with a Maximum Negentropy Criterion, Kenichi Kumatani, John McDonough, Dietrich Klakow, Philip N. Garner and Weifeng Li, in: Proceedings of the Joint Workshop on Hands-free Speech Communication and Microphone Arrays, Italy, 2008

Proceedings of ICASSP 2008 (2008)


MobileHCI 2008 (10th International Conference on Human-Computer Interaction with Mobile Devices and Services, Demonstrations Session) (2008)

Graphical representation of meetings on mobile devices, Lukas Matena, Alejandro Jaimes and Andrei Popescu-Belis, in: MobileHCI 2008 (10th International Conference on Human-Computer Interaction with Mobile Devices and Services, Demonstrations Session), Amsterdam, 2008

International Conference on Automatic Face and Gesture Recognition (2008)

Identifying Dominant People in Meetings from Audio-Visual Sensors, Hayley Hung and Daniel Gatica-Perez, in: International Conference on Automatic Face and Gesture Recognition, Amsterdam, The Netherlands, 2008

International Conference on Multi-modal Interfaces (2008)


Proceedings of INTERSPEECH, September 2008 (2008)


Proceedings of the ACM International Conference on Multimedia (2008)


Proceedings of International Conference on Multimodal Interfaces (to appear) (2008)

Social Signals, their Function, and Automatic Analysis: A Survey, Alessandro Vinciarelli, Maja Pantic, Hervé Bourlard and Alex Pentland, in: Proceedings of International Conference on Multimodal Interfaces (to appear), 2008

6th International Conference on Language Resources and Evaluation (2008)

Task-based evaluation of meeting browsers: from BET task elicitation to user behavior analysis, Andrei Popescu-Belis, Mike Flynn, Pierre Wellner and Philippe Baudrion, in: 6th International Conference on Language Resources and Evaluation, Marrakech, Morocco, 2008

Machine Learning for Multimodal Interaction V (2008)


European Conference on Computer Vision Workshop on Multi-camera and Multi-modal Sensor Fusion (2008)

Towards Audio-Visual On-line Diarization Of Participants In Group Meetings, Hayley Hung and Gerald Friedland, in: European Conference on Computer Vision Workshop on Multi-camera and Multi-modal Sensor Fusion, 2008

IEEE International Conference on Acoustics, Speech, and Signal Processing (2007)


Publications of type Phdthesis


2010
