AMIDA - Idiap Publications

Feature Mapping of Multiple Beamformed Sources for Robust Overlapping Speech Recognition Using a Microphone Array, Weifeng Li, Longbiao Wang, Yicong Zhou, John Dines, Mathew Magimai-Doss, Hervé Bourlard and Qingmin Liao, Idiap-RR-17-2014

A Speech-based Just-in-Time Retrieval System using Semantic Search, Andrei Popescu-Belis, Majid Yazdani, Alexandre Nanchen and Philip N. Garner, Idiap-RR-31-2011

Finding Information in Multimedia Records of Meetings, Andrei Popescu-Belis, Denis Lalanne and Hervé Bourlard, Idiap-RR-32-2011

When Users Meet Technology: The Meeting Browser Development Helix, Andrei Popescu-Belis, Denis Lalanne and Hervé Bourlard, Idiap-RR-05-2011

Advances in Fast Multistream Diarization based on the Information Bottleneck Framework, Deepu Vijayasenan, Fabio Valente and Hervé Bourlard, Idiap-RR-23-2010

AMIDA/Klewel Mini-Project, Petr Motlicek, Philip N. Garner, Maël Guillemot and Vincent Bozzo, Idiap-RR-03-2010

An Adaptive Initialization Method for Speaker Diarization based on Prosodic Features, David Imseng and Gerald Friedland, Idiap-RR-02-2010

Application of Out-Of-Language Detection To Spoken-Term Detection, Petr Motlicek and Fabio Valente, Idiap-RR-04-2010

Estimating Cohesion in Small Groups using Audio-Visual Nonverbal Behavior, Hayley Hung and Daniel Gatica-Perez, Idiap-RR-12-2010

KL Realignment for Speaker Diarization with Multiple Feature Streams, Deepu Vijayasenan, Fabio Valente and Hervé Bourlard, Idiap-RR-24-2010

The ACLD: Speech-based Just-in-Time Retrieval of Multimedia Documents and Websites, Andrei Popescu-Belis, Jonathan Kilgour, Alexandre Nanchen and Peter Poller, Idiap-RR-26-2010

Tracter: A Lightweight Dataflow Framework, Philip N. Garner and John Dines, Idiap-RR-10-2010

Analysis of F0 and Cepstral Features for Robust Automatic Gender Recognition, Marianna Pronobis and Mathew Magimai-Doss, Idiap-RR-30-2009

Applications of Signal Analysis Using Autoregressive Models for Amplitude Modulation, Sriram Ganapathy, Samuel Thomas, Petr Motlicek and Hynek Hermansky, Idiap-RR-35-2009

Automatic Out-of-Language Detection based on Confidence Measures derived from LVCSR Word and Phone Lattices, Petr Motlicek, Idiap-RR-06-2009

Automatic vs. human question answering over multimedia meeting recordings, Quoc Anh Le and Andrei Popescu-Belis, Idiap-RR-13-2009

Comparing meeting browsers using a task-based evaluation method, Andrei Popescu-Belis, Idiap-RR-11-2009

Novel initialization methods for Speaker Diarization, David Imseng, Idiap-RR-07-2009

Phoneme Recognition Using Spectral Envelope and Modulation Frequency Features, Samuel Thomas, Sriram Ganapathy and Hynek Hermansky, Idiap-RR-04-2009

Real-Time ASR from Meetings, Philip N. Garner, John Dines, Thomas Hain, Asmaa El Hannani, Martin Karafiat, Danil Korchagin, Mike Lincoln, Vincent Wan and Le Zhang, Idiap-RR-15-2009

Robust Speaker Diarization for Short Speech Recordings, David Imseng and Gerald Friedland, Idiap-RR-26-2009

User Interface Design in a Just-in-time Retrieval System for Meetings, Andrei Popescu-Belis, Peter Poller, Jonathan Kilgour, Mike Flynn, Sebastian Germesin, Alexandre Nanchen and Majid Yazdani, Idiap-RR-38-2009

Visual activity context for focus of attention estimation in dynamic meetings, Silèye O. Ba, Hayley Hung and Jean-Marc Odobez, Idiap-RR-02-2009

Identifying Dominant People in Meetings from Audio-Visual Sensors, Hayley Hung and Daniel Gatica-Perez, Idiap-RR-65-2008

Integrating audio and vision for robust automatic gender recognition, Marianna Pronobis and Mathew Magimai-Doss, Idiap-RR-73-2008

Modulation Frequency Features For Phoneme Recognition In Noisy Speech, Sriram Ganapathy, Samuel Thomas and Hynek Hermansky, Idiap-RR-70-2008

Finding without searching, Andrei Popescu-Belis, Idiap-Com-01-2010

Finding Information in Multimedia Records of Meetings, Andrei Popescu-Belis, Denis Lalanne and Hervé Bourlard, in: IEEE Multimedia, 19(2):48-57, 2012

The ICSI RT-09 Speaker Diarization System, Gerald Friedland, Adam Janin, David Imseng, Xavier Anguera, Luke Gottlieb, Marijn Huijbregts, Mary Tai Knox and Oriol Vinyals, in: IEEE Transactions on Audio, Speech, and Language Processing, 20(2):371-381, 2012

Transcribing meetings with the AMIDA systems, Thomas Hain, Lukas Burget, John Dines, Philip N. Garner, Frantisek Grezl, Asmaa El Hannani, Marijn Huijbregts, Martin Karafiat, Mike Lincoln and Vincent Wan, in: IEEE Transactions on Audio, Speech, and Language Processing, 20(2):486-498, 2012

Multi-Person Visual Focus of Attention from Head Pose and Meeting Contextual Cues, Silèye O. Ba and Jean-Marc Odobez, in: IEEE Trans. on Pattern Analysis and Machine Intelligence, 33(1):101-116, 2011

An Information Theoretic Combination of MFCC and TDOA Features for Speaker Diarization, Deepu Vijayasenan, Fabio Valente and Hervé Bourlard, in: IEEE Transactions on Audio, Speech, and Language Processing, 19(2), 2011

Estimating Dominance in Multi-Party Meetings Using Speaker Diarization, Hayley Hung, Yan Huang, Gerald Friedland and Daniel Gatica-Perez, in: IEEE Transactions on Audio, Speech, and Language Processing, 19(4):847-860, 2011

Estimating Cohesion in Small Groups Using Audio-Visual Nonverbal Behavior, Hayley Hung and Daniel Gatica-Perez, in: IEEE Trans. on Multimedia, Special Issue on Multimodal Affective Interaction, 12(6):563-575, 2010

An Information Theoretic Approach to Speaker Diarization of Meeting Data, Deepu Vijayasenan, Fabio Valente and Hervé Bourlard, in: IEEE Transactions on Audio, Speech, and Language Processing, 17(7), 2009

Beamforming with a Maximum Negentropy Criterion, Kenichi Kumatani, John McDonough, Barbara Rauch, Dietrich Klakow, Philip N. Garner and Weifeng Li, in: IEEE Transactions on Audio, Speech, and Language Processing, 17(5), 2009

Recognizing Human Visual Focus of Attention from Head Pose in Meetings, Silèye O. Ba and Jean-Marc Odobez, in: IEEE Transactions on Systems, Man, and Cybernetics, Part B, 39(1), 2009

Modeling Dominance in Group Conversations using NonVerbal Activity Cues, Dinesh Babu Jayagopi, Hayley Hung, Chuohao Yeo and Daniel Gatica-Perez, in: IEEE Transactions on Audio, Speech and Language Processing, 2008

Modulation Frequency Features For Phoneme Recognition In Noisy Speech, Sriram Ganapathy, Samuel Thomas and Hynek Hermansky, in: Journal of Acoustical Society of America - Express Letters, 2008

Multimodal Signal Processing: Human Interactions in Meetings, Steve Renals, Hervé Bourlard, Jean Carletta and Andrei Popescu-Belis, Cambridge University Press, 2012

Machine Learning for Multimodal Interaction IV, Andrei Popescu-Belis, Hervé Bourlard and Steve Renals, Springer-Verlag, LNCS, volume 4892, 2008

Machine Learning for Multimodal Interaction V, Andrei Popescu-Belis and Rainer Stiefelhagen, Springer-Verlag, LNCS, volume 5237, 2008

Evaluation of Meeting Support Technology, Simon Tucker and Andrei Popescu-Belis, in: Multimodal Signal Processing: Human Interactions in Meetings, pages 237-252, Cambridge University Press, 2012

Multimodal Signal Processing for Meetings: an Introduction, Andrei Popescu-Belis and Jean Carletta, in: Multimodal Signal Processing: Human Interactions in Meetings, pages 1-11, Cambridge University Press, 2012

Sampling techniques for audio-visual tracking and head pose estimation, Jean-Marc Odobez and Oswald Lanz, in: Multimodal Signal Processing: Human Interactions in Meetings, pages 84-102, Cambridge University Press, 2012

Speaker Diarization, Fabio Valente and Gerald Friedland, in: Multimodal Signal Processing: Human Interactions in Meetings, Cambridge University Press, 2012

User Requirements for Meeting Support Technology, Denis Lalanne and Andrei Popescu-Belis, in: Multimodal Signal Processing: Human Interactions in Meetings, pages 210-221, Cambridge University Press, 2012

Accessing a Large Multimodal Corpus using an Automatic Content Linking Device, Andrei Popescu-Belis, Jean Carletta, Jonathan Kilgour and Peter Poller, in: Multimodal Corpora: From Models of Natural Interaction to Systems and Applications, Springer-Verlag, 2009

Managing Multimodal Data, Metadata and Annotations: Challenges and Solutions, Andrei Popescu-Belis, in: Multimodal Signal Processing for Human-Computer Interaction, Elsevier / Academic Press, 2009

Towards an Objective Test for Meeting Browsers: the BET4TQB Pilot Experiment, Andrei Popescu-Belis, Philippe Baudrion, Mike Flynn and Pierre Wellner, in: Machine Learning for Multimodal Interaction IV, Springer-Verlag, 2008

A Just-in-Time Document Retrieval System for Dialogues or Monologues, Andrei Popescu-Belis, Majid Yazdani, Alexandre Nanchen and Philip N. Garner, in: SIGDIAL 2011 (12th annual SIGDIAL Meeting on Discourse and Dialogue), Demonstration Session, Portland, OR, pages 350-352, 2011

A Speech-based Just-in-Time Retrieval System using Semantic Search, Andrei Popescu-Belis, Majid Yazdani, Alexandre Nanchen and Philip N. Garner, in: Proceedings of the ACL-HLT 2011 System Demonstrations (49th Annual Meeting of the Association for Computational Linguistics), Portland, OR, pages 80-86, 2011

A Multimodal Corpus for Studying Dominance in Small Group Conversations, Oya Aran, Hayley Hung and Daniel Gatica-Perez, in: LREC workshop on Multimodal Corpora: Advances in Capturing, Coding and Analyzing Multimodality, Malta, May 2010, 2010

An Adaptive Initialization Method for Speaker Diarization based on Prosodic Features, David Imseng and Gerald Friedland, in: Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing, Dallas, USA, pages 4946-4949, 2010

Application of Out-Of-Language Detection To Spoken-Term Detection, Petr Motlicek and Fabio Valente, in: 2010 IEEE International Conference on Acoustics, Speech and Signal Processing, Dallas, USA, 2010

Automatic Content Linking: Speech-based Just-in-time Retrieval for Multimedia Archives, Andrei Popescu-Belis, Jonathan Kilgour, Peter Poller, Alexandre Nanchen, Erik Boertjes and Joost de Wit, in: Proceedings of the 33rd Annual ACM SIGIR Conference, Geneva, Switzerland, page 703, 2010

Multistream Speaker Diarization beyond Two Acoustic Feature Streams, Deepu Vijayasenan, Fabio Valente and Hervé Bourlard, in: International Conference on Acoustics, Speech, and Signal Processing, 2010

The ACLD: Speech-based Just-in-Time Retrieval of Meeting Transcripts, Documents and Websites, Andrei Popescu-Belis, Jonathan Kilgour, Alexandre Nanchen and Peter Poller, in: ACM Multimedia Workshop on Searching Spontaneous Conversational Speech, Florence, Italy, 2010

The AMIDA 2009 Meeting Transcription System, Thomas Hain, Lukas Burget, John Dines, Philip N. Garner, Asmaa El Hannani, Marijn Huijbregts, Martin Karafiat, Mike Lincoln and Vincent Wan, in: Proceedings of Interspeech, Makuhari, Japan, 2010

The Wolf Corpus: Exploring group behaviour in a competitive role-playing game, Hayley Hung and Gokul Chittaranjan, in: ACM Multimedia, 2010

Tracter: A Lightweight Dataflow Framework, Philip N. Garner and John Dines, in: Proceedings of Interspeech, Makuhari, Japan, 2010

Using Audio and Visual Cues for Speaker Diarisation Initialisation, Giulia Garau and Hervé Bourlard, in: International Conference on Acoustics, Speech and Signal Processing, 2010

View-Based Appearance Model Online Learning for 3D Deformable Face Tracking, Stéphanie Lefèvre and Jean-Marc Odobez, in: Proc. Int. Conf. on Computer Vision Theory and Applications, Angers, 2010

Audio–Visual Synchronisation for Speaker Diarisation, Giulia Garau, Alfred Dielmann and Hervé Bourlard, in: International Conference on Speech and Language Processing, Interspeech, Makuhari, Japan, 2010

A Multimedia Retrieval System Using Speech Input, Andrei Popescu-Belis, Peter Poller, Jonathan Kilgour, Erik Boertjes, Jean Carletta, Sandro Castronovo, Michal Fapso, Alexandre Nanchen, Theresa Wilson, Joost de Wit and Majid Yazdani, in: Proceedings of ICMI-MLMI 2009 (11th International Conference on Multimodal Interfaces and 6th Workshop on Machine Learning for Multimodal Interaction), Cambridge, MA, 2009

Applications of Signal Analysis Using Autoregressive Models for Amplitude Modulation, Sriram Ganapathy, Samuel Thomas, Petr Motlicek and Hynek Hermansky, in: IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA '09), Mohonk Mountain House, New Paltz, New York, USA, 2009

Automatic vs. human question answering over multimedia meeting recordings, Quoc Anh Le and Andrei Popescu-Belis, in: 10th Annual Conference of the International Speech Communication Association, 2009

Characterising Conversational Group Dynamics Using Nonverbal Behaviour, Dinesh Babu Jayagopi, Raducanu Bogdan and Daniel Gatica-Perez, in: Proceedings ICME 2009, 2009

Discovering Group Nonverbal Conversational Patterns with Topics, Dinesh Babu Jayagopi and Daniel Gatica-Perez, in: Proceedings ICMI-MLMI, 2009

Investigating Privacy-Sensitive Features for Speech Detection in Multiparty Conversations, Sree Hari Krishnan Parthasarathi, Mathew Magimai-Doss, Hervé Bourlard and Daniel Gatica-Perez, in: Proceedings of Interspeech 2009, 2009

Investigating the use of Visual Focus of Attention for Audio-Visual Speaker Diarisation, Giulia Garau, Silèye O. Ba, Hervé Bourlard and Jean-Marc Odobez, in: Proceedings of the ACM International Conference on Multimedia, Beijing, China, 2009

KL Realignment for Speaker Diarization with Multiple Feature Streams, Deepu Vijayasenan, Fabio Valente and Hervé Bourlard, in: 10th Annual Conference of the International Speech Communication Association, 2009

Learning Large Margin Likelihood for Realtime Head Pose Tracking, Elisa Ricci and Jean-Marc Odobez, in: IEEE Int. Conference on Image Processing, Cairo, Egypt, IEEE, 2009

Mutual Information based Channel Selection for Speaker Diarization of Meetings Data, Deepu Vijayasenan, Fabio Valente and Hervé Bourlard, in: Proceedings of International Conference on Acoustics, Speech and Signal Processing, 2009

Non-linear mapping for multi-channel speech separation and robust overlapping speech recognition, Weifeng Li, John Dines, Mathew Magimai-Doss and Hervé Bourlard, in: Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2009

Posterior features applied to speech recognition tasks with user-defined vocabulary, Guillermo Aradilla, Hervé Bourlard and Mathew Magimai-Doss, in: Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2009

Predicting Remote Versus Collocated Group Interactions using Nonverbal Cues, Dairazalia Sanchez-Cortes, Dinesh Babu Jayagopi and Daniel Gatica-Perez, in: Proc. Int. Conf. on Multimodal Interfaces, Workshop on Multimodal Sensor-Based Systems and Mobile Phones for Social Computing, Cambridge, 2009

Real-Time ASR from Meetings, Philip N. Garner, John Dines, Thomas Hain, Asmaa El Hannani, Martin Karafiat, Danil Korchagin, Mike Lincoln, Vincent Wan and Le Zhang, in: Proceedings of Interspeech, Brighton, UK, 2009

Robust Speaker Diarization for Short Speech Recordings, David Imseng and Gerald Friedland, in: Proceedings of the IEEE workshop on Automatic Speech Recognition and Understanding, Merano, Italy, pages 432-437, 2009

Structure and appearance features for robust 3D facial actions tracking, Stéphanie Lefèvre and Jean-Marc Odobez, in: IEEE Proc. Int. Conf. on Multimedia and Expo, IEEE, 2009

Visual Activity Context For Focus of Attention Estimation in Dynamic Meetings, Silèye O. Ba, Hayley Hung and Jean-Marc Odobez, in: International Conference on Multimedia & Expo, 2009

Visual Speaker Localization Aided by Acoustic Models, Gerald Friedland, Chuohao Yeo and Hayley Hung, in: ACM Multimedia, 2009

Automatic Out-of-Language Detection Based on Confidence Measures Derived from LVCSR Word and Phone Lattices, Petr Motlicek, in: 10th Annual Conference of the International Speech Communication Association, ISCA, Brighton, England, 2009

Adaptive Beamforming with a Maximum Negentropy Criterion, Kenichi Kumatani, John McDonough, Dietrich Klakow, Philip N. Garner and Weifeng Li, in: Proceedings of the Joint Workshop on Hands-free Speech Communication and Microphone Arrays, Italy, 2008

Filter Bank Design based on Minimization of Individual Aliasing Terms for Minimum Mutual Information Subband Adaptive Beamforming, Kenichi Kumatani, John McDonough, Stefan Schacht, Dietrich Klakow, Philip N. Garner and Weifeng Li, in: Proceedings of ICASSP 2008, Las Vegas, USA, 2008

Graphical representation of meetings on mobile devices, Lukas Matena, Alejandro Jaimes and Andrei Popescu-Belis, in: MobileHCI 2008 (10th International Conference on Human-Computer Interaction with Mobile Devices and Services, Demonstrations Session), Amsterdam, 2008

Identifying Dominant People in Meetings from Audio-Visual Sensors, Hayley Hung and Daniel Gatica-Perez, in: International Conference on Automatic Face and Gesture Recognition, Amsterdam, The Netherlands, 2008

Investigating Automatic Dominance Estimation in Groups From Visual Attention and Speaking Activity, Hayley Hung, Dinesh Babu Jayagopi, Silèye O. Ba, Jean-Marc Odobez and Daniel Gatica-Perez, in: International Conference on Multi-modal Interfaces, 2008

Maximum kurtosis beamforming with the generalized sidelobe canceller, Kenichi Kumatani, John McDonough, Barbara Rauch, Philip N. Garner, Weifeng Li and John Dines, in: Proceedings of INTERSPEECH, September 2008, Brisbane, Australia, 2008

Social Signal Processing: State-of-the-Art and Future Perspectives of an Emerging Domain, Alessandro Vinciarelli, Maja Pantic, Hervé Bourlard and Alex Pentland, in: Proceedings of the ACM International Conference on Multimedia, 2008

Social Signals, their Function, and Automatic Analysis: A Survey, Alessandro Vinciarelli, Maja Pantic, Hervé Bourlard and Alex Pentland, in: Proceedings of International Conference on Multimodal Interfaces (to appear), 2008

Task-based evaluation of meeting browsers: from BET task elicitation to user behavior analysis, Andrei Popescu-Belis, Mike Flynn, Pierre Wellner and Philippe Baudrion, in: 6th International Conference on Language Resources and Evaluation, Marrakech, Morocco, 2008

The AMIDA Automatic Content Linking Device: Just-in-Time Document Retrieval in Meetings, Andrei Popescu-Belis, Erik Boertjes, Jonathan Kilgour, Peter Poller, Sandro Castronovo, Theresa Wilson, Alejandro Jaimes and Jean Carletta, in: Machine Learning for Multimodal Interaction V, Utrecht, Springer-Verlag, 2008

Towards Audio-Visual On-line Diarization Of Participants In Group Meetings, Hayley Hung and Gerald Friedland, in: European Conference on Computer Vision Workshop on Multi-camera and Multi-modal Sensor Fusion, 2008

Estimating the Dominant Person in Multi-Party Conversations Using Speaker Diarization Strategies, Hayley Hung, Yan Huang, Gerald Friedland and Daniel Gatica-Perez, in: IEEE International Conference on Acoustics, Speech, and Signal Processing, 2007

An Information Theoretic Approach to Speaker Diarization of Meeting Recordings, Deepu Vijayasenan, Ecole polytechnique fédérale de Lausanne, 2010
