Publication list - Idiap Publications

Recognizing the Visual Focus of Attention for Human Robot Interaction, Samira Sheikhi, Vasil Khalidov and Jean-Marc Odobez, in: IEEE International Conference on Intelligent Robots and Systems (IROS) - Human Behavior Understanding Workshop (IROS-HBU), 2012

Robot-to-group Interaction in a Vernissage: Architecture & Dataset for Multi-party Dialog, David Klotz, Johannes Wienke, Britta Wrede, Sebastian Wrede, Samira Sheikhi, Dinesh Babu Jayagopi, Vasil Khalidov and Jean-Marc Odobez, in: Proceedings of 5th International Conference on Cognitive Systems, 2012

Robust triphone mapping for acoustic modeling, Milos Cernak, David Imseng and Hervé Bourlard, in: Proceedings of Interspeech, Portland, Oregon, 2012

Socio-Technical Network Analysis from Wearable Interactions, Katayoun Farrahi, Remi Emonet and Alois Ferscha, in: International Symposium on Wearable Computers, 2012

Speaker Diarization and Linking of Large Corpora, Marc Ferras and Hervé Bourlard, in: Proceedings of the IEEE Workshop on Spoken Language Technology, 2012

Speaker Diarization of Meetings based on large TDOA feature vectors, Deepu Vijayasenan and Fabio Valente, in: Proceedings of International Conference on Acoustics, Speech and Signal Processing, 2012

Speaker diarization of overlapping speech based on silence distribution in meeting recordings, Sree Harsha Yella and Fabio Valente, in: INTERSPEECH, Portland, Oregon, USA, 2012

StressSense: Detecting Stress in Unconstrained Acoustic Environments using Smartphones, Hong Lu, Mashfiqui Rabbi, Gokul Chittaranjan, Denise Frauendorfer, Marianne Schmid Mast, Andrew T. Campbell, Daniel Gatica-Perez and Tanzeem Choudhury, in: Ubicomp'12, Pittsburgh, 2012

Structured Sparse Coding for Microphone Array Location Calibration, Afsaneh Asaei, Bhiksha Raj, Hervé Bourlard and Volkan Cevher, in: SAPA-SCALE Conference, The 5th ISCA workshop on statistical and perceptual audition, 2012

Sub-Band Based Log-Energy and its Dynamic Range Stretching for Robust In-Car Speech Recognition, Weifeng Li and Hervé Bourlard, in: Proceedings of the 13th Annual Conference of the International Speech Communication Association (InterSpeech), Portland, Oregon, 2012

Supervised and unsupervised Web-based language model domain adaptation, Gwénolé Lecorvé, John Dines, Thomas Hain and Petr Motlicek, in: Proceedings of Interspeech, Portland, Oregon, USA, pages to appear, 2012

Synthetic References for Template-based ASR using Posterior Features, Serena Soldo, Mathew Magimai-Doss and Hervé Bourlard, in: Proceedings of Interspeech, Portland, Oregon, USA, 2012

Template-based ASR using Posterior features and synthetic references: comparing different TTS systems, Serena Soldo, Mathew Magimai-Doss and Hervé Bourlard, in: SAPA-SCALE Conference, International Speech Communication Association, 2012

The Good, the Bad, and the Angry: Analyzing Crowdsourced Impressions of Vloggers, Joan-Isaac Biel and Daniel Gatica-Perez, in: Proceedings of AAAI International Conference on Weblogs and Social Media, 2012

The I4U Submission to the 2012 NIST Speaker Recognition Evaluation, Kong Aik Lee, Rahim Saedi, Tawfik Hasan, Tomi Kinnunen, Benoit Fauve, Pierre-Michel Bousquet, Elie Khoury, Pablo Luis Sordo Martinez, Tharmarajah Thiruvaran, Changhuai You, Padmanabhan Rajan, David Van Leeuwen, Seyed Omid Sadjadi, Driss Matrouf, Laurent El Shafey, John Mason, Eliathamby Ambikairajah, Hanwu Sun, Anthony Larcher, Bin Ma, Ville Hautamäki, Cemal Hanilci, Billy Braithwaite, Gonzalez-Hautamäki Rosa, Gang Liu, Hynek Boril, Navid Shokouhi, John Hansen, Jean-François Bonastre and Sébastien Marcel, in: NIST Speaker Recognition Conference, 2012

The Idiap Speaker Recognition Evaluation System at NIST SRE 2012, Elie Khoury, Laurent El Shafey and Sébastien Marcel, in: NIST Speaker Recognition Conference, NIST, Orlando, USA, 2012

The INTERSPEECH 2012 Speaker Trait Challenge, Björn Schuller, Stefan Steidl, Anton Batliner, Elmar Nöth, Alessandro Vinciarelli, Felix Burkhardt, Rob Van Son, Felix Weninger, Florian Eyben, Tobias Bocklet, Gelareh Mohammadi and Benjamin Weiss, in: Proceedings of INTERSPEECH, 2012

The Mobile Data Challenge: Big Data for Mobile Computing Research, J. K. Laurila, Daniel Gatica-Perez, I. Aad, J. Blom, Olivier Bornet, Trinh-Minh-Tri Do, O. Dousse, J. Eberle and M. Miettinen, in: Pervasive Computing, Newcastle, 2012

Translating English Discourse Connectives into Arabic: a Corpus-based Analysis and an Evaluation Metric, Najeh Hajlaoui and Andrei Popescu-Belis, in: Fourth Workshop on Computational Approaches to Arabic Script-based Languages at Proceedings of the Tenth Biennial Conference of the Association for Machine Translation in the Americas (AMTA), 2012

Unsupervised Activity Analysis and Monitoring algorithms for Effective Surveillance Systems, Jean-Marc Odobez, C. Carincotte, Remi Emonet, E. Jouneau, Sofia Zaidenberg, Bertrand Raverra, Francois Bremond and Andrea Grifoni, in: European Conference on Computer Vision, 2012

Using Crowdsourcing to Compare Document Recommendation Strategies for Conversations, Maryam Habibi and Andrei Popescu-Belis, in: RecSys, Recommendation Utility Evaluation (RUE 2012), Dublin, Ireland, pages 15-20, 2012

Using KL-divergence and multilingual information to improve ASR for under-resourced languages, David Imseng, Hervé Bourlard and Philip N. Garner, in: Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing, Kyoto, pages 4869-4872, 2012

Using Self-Context for Multimodal Detection of Head Nods in Face-to-Face Interactions, Laurent Son Nguyen, Jean-Marc Odobez and Daniel Gatica-Perez, in: Proceedings of the 14th ACM International Conference on Multimodal Interaction, 2012

Using Sense-labeled Discourse Connectives for Statistical Machine Translation, Thomas Meyer and Andrei Popescu-Belis, in: Proceedings of the EACL2012 Workshop on Hybrid Approaches to Machine Translation (HyTra), Avignon, FR, pages 129-138, 2012

Using Sparse Classification Outputs as Feature Observations for Noise Robust ASR, Yang Sun, B. Cranen, Jort F. Gemmeke, Lou Boves, Louis ten Bosch and Mathew Magimai-Doss, in: Proceedings of Interspeech, 2012

We are not Contortionists: Coupled Adaptive Learning for Head and Body Orientation Estimation in Surveillance Video, Cheng Chen and Jean-Marc Odobez, in: IEEE International Conference on Computer Vision and Pattern Recognition, 2012

A Bimodal Sound Source Model for Vehicle Tracking in Traffic Monitoring, Patrick Marmaroli, Jean-Marc Odobez, Xavier Falourd and Hervé Lissek, in: European Signal Processing Conference, 2011

A BSS-based Approach for Localization of Simultaneous Speakers in Reverberant Conditions, Hamid Reza Abutalebi, Hedieh Heli, Danil Korchagin and Hervé Bourlard, in: Proceedings of the 19th European Signal Processing Conference (EUSIPCO), 2011

A Compressive Sensing Based Compressed Neural Network for Sound Source Localization, Mehdi Banitalebi Dehkordi, Hamid Reza Abutalebi and Hossein Ghanei, in: Proceedings of International Symposium on Artificial Intelligence and Signal Processing, 2011

A Corpus-based Contrastive Analysis for Defining Minimal Semantics of Inter-sentential Dependencies for Machine Translation, Thomas Meyer, Andrei Popescu-Belis, Jeevanthi Liyanapathirana and Bruno Cartoni, in: Proceedings of the GSCL2011 Workshop on "Contrastive Analysis - Translation Studies - Machine Translation: What can we learn from each other?", Hamburg, Germany, pages 5, 2011

A Joint Estimation of Head and Body Orientation Cues in Surveillance Video, Cheng Chen, Alexandre Heili and Jean-Marc Odobez, in: IEEE International Workshop on Socially Intelligent Surveillance and Monitoring, 2011

A Just-in-Time Document Retrieval System for Dialogues or Monologues, Andrei Popescu-Belis, Majid Yazdani, Alexandre Nanchen and Philip N. Garner, in: SIGDIAL 2011 (12th annual SIGDIAL Meeting on Discourse and Dialogue), Demonstration Session, Portland, OR, pages 350-352, 2011

A Large-Scale Database of Images and Captions for Automatic Face Naming, Mert Ozcan, Jie Luo, Vittorio Ferrari and Barbara Caputo, in: Proceedings of the 22nd British Machine Vision Conference, 2011

A Speech-based Just-in-Time Retrieval System using Semantic Search, Andrei Popescu-Belis, Majid Yazdani, Alexandre Nanchen and Philip N. Garner, in: Proceedings of the ACL-HLT 2011 System Demonstrations (49th Annual Meeting of the Association for Computational Linguistics), Portland, OR, pages 80-86, 2011

An Audio Visual Corpus for Emergent Leader Analysis, Dairazalia Sanchez-Cortes, Oya Aran and Daniel Gatica-Perez, in: Multimodal Corpora for Machine Learning: Taking Stock and Road mapping the Future, 2011

An Integrated Framework for Multi-Channel Multi-Source Localization and Voice Activity Detection, Mohammad J. Taghizadeh, Philip N. Garner, Hervé Bourlard, Hamid Reza Abutalebi and Afsaneh Asaei, in: The Third Joint Workshop on Hands-free Speech Communication and Microphone Arrays, 2011

Analysis and Comparison of Recent MLP Features for LVCSR Systems, Fabio Valente, Mathew Magimai-Doss and Wen Wang, in: Proceedings of Interspeech 2011, 2011

Audio Spatio-Temporal Fingerprints for Cloudless Real-Time Hands-Free Diarization on Mobile Devices, Danil Korchagin, in: Proceedings of the 3rd Joint Workshop on Hands-Free Speech Communication and Microphone Arrays, Edinburgh, UK, 2011

Automated Quantification of Morphodynamics for High-Throughput Live Cell Imaging Datasets, German Gonzalez, L. Fusco, Riwal Lefort, F. Benmansour, Pascal Fua and Kevin C. Smith, in: 1st International SystemsX.ch Conference on Systems Biology, 2011

Automatic Time Skew Detection and Correction, Danil Korchagin, in: Proceedings International Conference on Signal Acquisition and Processing, Singapore, 2011

Boosting with Maximum Adaptive Sampling, Charles Dubout and Francois Fleuret, in: Proceedings of the Neural Information Processing Systems Conference, 2011

Building 'directional corpora' for unbiased contrastive analysis, Bruno Cartoni and Thomas Meyer, in: Proceedings of Corpus Linguistics Conference, Birmingham, UK, pages 29-30, 2011

Combined Estimation of Location and Body Pose in Surveillance Video, Cheng Chen, Alexandre Heili and Jean-Marc Odobez, in: AVSS, 2011

Competition on Counter Measures to 2-D Facial Spoofing Attacks, Murali Mohan Chakka, André Anjos, Sébastien Marcel, Roberto Tronci, Daniele Muntoni, Gianluca Fadda, Maurizio Pili, Nicola Sirena, Gabriele Murgia, Marco Ristori, Fabio Roli, Junjie Yan, Dong Yi, Zhen Lei, Zhiwei Zhang, Stan Z. Li, William Robson Schwartz, Anderson Rocha, Helio Pedrini, Javier Lorenzo-Navarro, Modesto Castrillón-Santana, Jukka Maatta, Abdenour Hadid and Matti Pietikainen, in: Proceedings of IAPR IEEE International Joint Conference on Biometrics (IJCB), Washington DC, USA, 2011

Contextual grouping: discovering real-life interaction types from longitudinal Bluetooth data, Trinh-Minh-Tri Do and Daniel Gatica-Perez, in: 12th International Conference on Mobile Data Management, 2011

Cross-Lingual Speaker Discrimination Using Natural and Synthetic Speech, Mirjam Wester and Hui Liang, in: Proceedings of Interspeech, Florence, Italy, 2011

Deep Learning for Efficient Discriminative Parsing, Ronan Collobert, in: International Conference on Artificial Intelligence and Statistics, 2011

Detection-Based Multi-Human Tracking Using a CRF Model, Alexandre Heili, Cheng Chen and Jean-Marc Odobez, in: The Eleventh IEEE International Workshop on Visual Surveillance, 2011

Disambiguating discourse connectives using parallel corpora: senses vs. translations, Thomas Meyer, Charlotte Roze, Bruno Cartoni, Laurence Danlos, Sandrine Zufferey and Andrei Popescu-Belis, in: Proceedings of Corpus Linguistics Conference, Birmingham, UK, pages 104-105, 2011

Disambiguating Temporal-Contrastive Discourse Connectives for Machine Translation, Thomas Meyer, in: Proceedings of ACL-HLT 2011 Student Session, Association for Computational Linguistics, Portland, OR, pages 46-51, 2011