Idiap Research Institute
Project IM2
Name: IM2

Publications of IM2 sorted by title


A

A Fast Parts-based Approach to Speaker Verification using Boosted Slice Classifiers, Anindya Roy, Mathew Magimai.-Doss and Sébastien Marcel, in: IEEE Transactions on Information Forensics and Security, 7(1):241-254, 2012
attachment
A Just-in-Time Document Retrieval System for Dialogues or Monologues, Andrei Popescu-Belis, Majid Yazdani, Alexandre Nanchen and Philip N. Garner, in: SIGDIAL 2011 (12th annual SIGDIAL Meeting on Discourse and Dialogue), Demonstration Session, Portland, OR, pages 350-352, 2011
attachment
A Multimedia Retrieval System Using Speech Input, Andrei Popescu-Belis, Peter Poller, Jonathan Kilgour, Erik Boertjes, Jean Carletta, Sandro Castronovo, Michal Fapso, Alexandre Nanchen, Theresa Wilson, Joost de Wit and Majid Yazdani, in: Proceedings of ICMI-MLMI 2009 (11th International Conference on Multimodal Interfaces and 6th Workshop on Machine Learning for Multimodal Interaction), Cambridge, MA, 2009
attachment
A Multimodal Corpus for Studying Dominance in Small Group Conversations, Oya Aran, Hayley Hung and Daniel Gatica-Perez, in: LREC workshop on Multimodal Corpora: Advances in Capturing, Coding and Analyzing Multimodality, Malta, May 2010
attachment
A Probabilistic Framework for Multiple Speaker Localization, Youssef Oualil, Mathew Magimai.-Doss, Friedrich Faubel and Dietrich Klakow, in: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2013
attachment
A Random Walk Framework to Compute Textual Semantic Similarity: a Unified Model for Three Benchmark Tasks, Majid Yazdani and Andrei Popescu-Belis, in: Proceedings of the 4th IEEE International Conference on Semantic Computing (ICSC 2010), Carnegie Mellon University, Pittsburgh, PA, USA, 2010
attachment
A Speech-based Just-in-Time Retrieval System using Semantic Search, Andrei Popescu-Belis, Majid Yazdani, Alexandre Nanchen and Philip N. Garner, in: Proceedings of the ACL-HLT 2011 System Demonstrations (49th Annual Meeting of the Association for Computational Linguistics), Portland, OR, pages 80-86, 2011
[URL]
Accessing a Large Multimodal Corpus using an Automatic Content Linking Device, Andrei Popescu-Belis, Jean Carletta, Jonathan Kilgour and Peter Poller, in: Multimodal Corpora: From Models of Natural Interaction to Systems and Applications, Springer-Verlag, 2009
attachment
[DOI]
Ad-Hoc Microphone Array Calibration from Partial Distance Measurements, Mohammad J. Taghizadeh, Afsaneh Asaei, Philip N. Garner and Hervé Bourlard, in: Proceedings of the 4th Joint Workshop on Hands-free speech communication and Microphone Arrays, Villers-les-Nancy, pages 1-5, IEEE, 2014
attachment
[DOI]
An Adaptive Initialization Method for Speaker Diarization based on Prosodic Features, David Imseng and Gerald Friedland, in: Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing, Dallas, USA, pages 4946-4949, 2010
attachment
An Alternative Scanning Strategy to Detect Faces, Venkatesh Bala Subburaman and Sébastien Marcel, in: Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing, Dallas, USA, 2010
attachment
An Information Theoretic Approach to Speaker Diarization of Meeting Data, Deepu Vijayasenan, Fabio Valente and Hervé Bourlard, in: IEEE Transactions on Audio Speech and Language Processing, 17(7), 2009
attachment
[DOI]
Analysis of Group Conversations: Modeling Social Verticality, Oya Aran and Daniel Gatica-Perez, in: Computer Analysis of Human Behavior, pages 293-322, Springer London, 2011
Analyzing Flickr Groups, Radu-Andrei Negoescu and Daniel Gatica-Perez, in: Proc. of the Intl. Conf. on Image and Video Retrieval, ACM, 2008
Application of Out-Of-Language Detection To Spoken-Term Detection, Petr Motlicek and Fabio Valente, in: 2010 IEEE International Conference on Acoustics, Speech and Signal Processing, Dallas, USA, 2010
attachment
Applications of Signal Analysis Using Autoregressive Models for Amplitude Modulation, Sriram Ganapathy, Samuel Thomas, Petr Motlicek and Hynek Hermansky, in: IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA '09), IEEE, Mohonk Mountain House, New Paltz, New York, USA, 2009
attachment
[URL]
Assessing the Impact of Language Style on Emergent Leadership Perception from Ubiquitous Audio, Dairazalia Sanchez-Cortes, Petr Motlicek and Daniel Gatica-Perez, in: Proceedings of the 11th International Conference on Mobile and Ubiquitous Multimedia, Ulm, Germany, 2012
attachment
Audio–Visual Synchronisation for Speaker Diarisation, Giulia Garau, Alfred Dielmann and Hervé Bourlard, in: International Conference on Speech and Language Processing, Interspeech, Makuhari, Japan, 2010
attachment
Automated Delineation of Dendritic Networks in Noisy Image Stacks, German Gonzalez, Francois Fleuret and Pascal Fua, in: Proceedings of the European Conference on Computer Vision, 2008
Automatic detection of conflicts in spoken conversations: ratings and analysis of broadcast political debates, Samuel Kim, Fabio Valente and Alessandro Vinciarelli, in: Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing, Kyoto, Japan, 2012
attachment
Automatic vs. human question answering over multimedia meeting recordings, Quoc Anh Le and Andrei Popescu-Belis, in: 10th Annual Conference of the International Speech Communication Association, 2009
attachment

B

Boosted Binary Features for Noise-Robust Speaker Verification, Anindya Roy, Mathew Magimai.-Doss and Sébastien Marcel, in: 2010 IEEE International Conference on Acoustics, Speech and Signal Processing, Dallas, Texas, 2010
attachment
Boosting localized binary features for speech recognition, Anindya Roy, Mathew Magimai.-Doss and Sébastien Marcel, in: Symposium on Machine Learning in Speech and Language Processing (MLSLP), 2012
attachment
Boosting Localized Features for Speaker and Speech Recognition, Anindya Roy, Ecole Polytechnique Federale de Lausanne (EPFL), 2011
attachment
Boosting under-resourced speech recognizers by exploiting out of language data - Case study on Afrikaans, David Imseng, Hervé Bourlard and Philip N. Garner, in: Proceedings of the 3rd International Workshop on Spoken Languages Technologies for Under-resourced Languages, Cape Town, pages 60-67, 2012
attachment
Broadcasting oneself: Visual Discovery of Vlogging Styles, Oya Aran, Joan-Isaac Biel and Daniel Gatica-Perez, in: IEEE Transactions on Multimedia, 16(1):201-215, 2014
attachment
[DOI]

C

Capturing Order in Social Interactions, Alessandro Vinciarelli, in: IEEE Signal Processing Magazine, 2009
attachment
Context Aware Addressee Estimation for Human Robot Interaction, Samira Sheikhi, Dinesh Babu Jayagopi, Vasil Khalidov and Jean-Marc Odobez, in: Proceedings of the 6th Workshop on Eye Gaze in Intelligent Human Machine Interaction: Gaze in Multimodal Interaction, 2013
Contextual classification of image patches with latent aspect models, Florent Monay, Pedro Quelhas, Jean-Marc Odobez and Daniel Gatica-Perez, in: EURASIP Journal on Image and Video Processing, Special Issue on Patches in Vision, 2009
attachment
Convexity in source separation: Models, geometry, and algorithms, Michael McCoy, Volkan Cevher, Quoc Tran Dinh, Afsaneh Asaei and Luca Baldassarre, in: IEEE Signal Processing Magazine, Special Issue on Source Separation and Applications, 2013
attachment
Cross-Domain Personality Prediction: From Video Blogs to Small Group Meetings, Oya Aran and Daniel Gatica-Perez, in: 15th ACM International Conference on Multimodal Interaction, 2013
attachment

D

Diverse Keyword Extraction from Conversations, Maryam Habibi and Andrei Popescu-Belis, in: Proceedings of the ACL 2013 (51st Annual Meeting of the Association for Computational Linguistics), Short Papers, Sofia, Bulgaria, pages 651-657, ACL, 2013
attachment

E

Enabling speech applications using Ad-Hoc Microphone Arrays, Mohammad J. Taghizadeh, École Polytechnique Fédérale de Lausanne, 2015
attachment
Enforcing Topic Diversity in a Document Recommender for Conversations, Maryam Habibi and Andrei Popescu-Belis, in: Proceedings of the Coling 2014 (25th International Conference on Computational Linguistics), Dublin, Ireland, pages 746-759, IEEE, 2014
attachment
English Spoken Term Detection in Multilingual Recordings, Petr Motlicek, Fabio Valente and Philip N. Garner, in: Proceedings of Interspeech, Makuhari, Japan, ISCA, 2010
attachment
Evaluation of Meeting Support Technology, Simon Tucker and Andrei Popescu-Belis, in: Multimodal Signal Processing: Human Interactions in Meetings, pages 237-252, Cambridge University Press, 2012

F

Fast and flexible Kullback-Leibler divergence based acoustic modeling for non-native speech recognition, David Imseng, Ramya Rasipuram and Mathew Magimai.-Doss, in: Proceedings of the IEEE workshop on Automatic Speech Recognition and Understanding, Hawaii, USA, pages 348-353, 2011
attachment
Fast Bounding Box Estimation based Face Detection, Venkatesh Bala Subburaman and Sébastien Marcel, in: ECCV, Workshop on Face Detection: Where we are, and what next?, 2010
attachment
[URL]
Fast Speaker Verification on Mobile Phone data using Boosted Slice Classifiers, Anindya Roy, Mathew Magimai.-Doss and Sébastien Marcel, in: IAPR IEEE International Joint Conference on Biometrics, Washington DC, 2011
attachment
Floor Holder Detection and End of Speaker Turn Prediction in Meetings, Alfred Dielmann, Giulia Garau and Hervé Bourlard, in: International Conference on Speech and Language Processing, Interspeech, Makuhari, Japan, ISCA, 2010
attachment

G

Gender Classification by LUT based boosting of Overlapping Block Patterns, Rakesh Mehta, Manuel Günther and Sébastien Marcel, in: Scandinavian Conference on Image Analysis, pages 530-542, Springer International Publishing, 2015
attachment
[DOI]
[URL]

H

Hi YouTube! Personality Impressions and Verbal Content in Social Video, Joan-Isaac Biel, Daniel Gatica-Perez, John Dines and Vagia Tsiminaki, in: 15th ACM International Conference on Multimodal Interaction, Sydney, Australia, ACM, 2013
attachment
Hierarchical Multilayer Perceptron based Language Identification, David Imseng, Mathew Magimai.-Doss and Hervé Bourlard, in: Proceedings of Interspeech, Makuhari, Japan, pages 2722-2725, 2010
attachment

I

Identifying Dominant People in Meetings from Audio-Visual Sensors, Hayley Hung and Daniel Gatica-Perez, in: International Conference on Automatic Face and Gesture Recognition, Amsterdam, The Netherlands, 2008
attachment
Implicit Human Centered Tagging, Alessandro Vinciarelli, Nicolae Suditu and Maja Pantic, in: Proceedings of IEEE Conference on Multimedia and Expo, 2009
attachment
Improving Acoustic Based Keyword Spotting Using LVCSR Lattices, Petr Motlicek, Fabio Valente and Igor Szoke, in: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, IEEE, Japan, pages 4413-4416, 2012
Interactive Multimodal Information Management: Shaping the Vision, Andrei Popescu-Belis and Hervé Bourlard, in: Interactive Multimodal Information Management, pages 1-17, EPFL Press, 2013
attachment
Introducing Crossmodal Biometrics: Person Identification from Distinct Audio & Visual Streams, Anindya Roy and Sébastien Marcel, in: IEEE Fourth International Conference on Biometrics: Theory, Applications and Systems, 2010
attachment
Investigating the Impact of Language Style and Vocal Expression on Social Roles of Participants in Professional Meetings, A. Sapru and Hervé Bourlard, in: Affective Computing and Intelligent Interaction, Geneva, pages 324-329, IEEE, 2013
attachment
[DOI]
Investigating the use of Visual Focus of Attention for Audio-Visual Speaker Diarisation, Giulia Garau, Silèye O. Ba, Hervé Bourlard and Jean-Marc Odobez, in: Proceedings of the ACM International Conference on Multimedia, Beijing, China, 2009
attachment

J

Joint Pose Estimator and Feature Learning for Object Detection, Karim Ali, Francois Fleuret, David Hasler and Pascal Fua, in: Proceedings of the IEEE International Conference on Computer Vision, 2009

K

KL Realignment for Speaker Diarization with Multiple Feature Streams, Deepu Vijayasenan, Fabio Valente and Hervé Bourlard, in: 10th Annual Conference of the International Speech Communication Association, 2009

L

Language dependent universal phoneme posterior estimation for mixed language speech recognition, David Imseng, Hervé Bourlard, Mathew Magimai.-Doss and John Dines, in: Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing, Prague, Czech Republic, pages 5012-5015, 2011
attachment
Learning Large Margin Likelihood for Realtime Head Pose Tracking, Elisa Ricci and Jean-Marc Odobez, in: IEEE Int. Conference on Image Processing, Cairo, Egypt, IEEE, 2009
attachment
Learning Rotational Features for Filament Detection, German Gonzalez, Francois Fleuret and Pascal Fua, in: Proceedings of the IEEE international conference on Computer Vision and Pattern Recognition, 2009
Learning to learn new models of human activities in indoor settings, Fabian Nater, Tatiana Tommasi, Luc Van Gool and Barbara Caputo, in: Interactive Multimodal Information Management, EPFL Press, 2013
attachment
Learning to Rank on Network Data, Majid Yazdani, Ronan Collobert and Andrei Popescu-Belis, in: Mining and Learning with Graphs, 2013
attachment
Leveraging speaker diarization for meeting recognition from distant microphones, Andreas Stolcke, Gerald Friedland and David Imseng, in: Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing, pages 4390-4393, 2010
Leveraging the robot dialog state for visual focus of attention recognition, Samira Sheikhi, Vasil Khalidov, David Klotz, Britta Wrede and Jean-Marc Odobez, in: Proceedings of the 15th ACM on International Conference on Multimodal Interaction, 2013

M

Managing Multimodal Data, Metadata and Annotations: Challenges and Solutions, Andrei Popescu-Belis, in: Multimodal Signal Processing for Human-Computer Interaction, Elsevier / Academic Press, 2009
Manifold Sparse Beamforming, Baran Gözcü, Afsaneh Asaei and Volkan Cevher, in: IEEE International Workshop on Computational Advances in Multi-Sensor Adaptive Processing, Saint Martin, France, pages 113-116, IEEE, 2013
attachment
[DOI]
MediaParl: Bilingual mixed language accented speech database, David Imseng, Hervé Bourlard, Holger Caesar, Philip N. Garner, Gwénolé Lecorvé and Alexandre Nanchen, in: Proceedings of the 2012 IEEE Workshop on Spoken Language Technology, pages 263-268, 2012
attachment
Medical image annotation, Barbara Caputo, in: Interactive Multimodal Information Management, EPFL Press, 2013
attachment
Microphone Array Beampattern Characterization for Hands-free Speech Applications, Mohammad J. Taghizadeh, Philip N. Garner and Hervé Bourlard, in: IEEE 7th Sensor Array and Multichannel Signal Processing Workshop (SAM), Hoboken, NJ, USA, pages 473-476, 2012
attachment
MLP Based Hierarchical System for Task Adaptation in ASR, Joel Praveen Pinto, Mathew Magimai.-Doss and Hervé Bourlard, in: Proceedings of the IEEE workshop on Automatic Speech Recognition and Understanding, Merano, Italy, 2009
attachment
Model-based Sparse Component Analysis for Reverberant Speech Localization, Afsaneh Asaei, Hervé Bourlard, Mohammad J. Taghizadeh and Volkan Cevher, in: 2014 IEEE International Conference on Acoustics, Speech and Signal Processing, pages 1439-1443, IEEE, 2014
attachment
[DOI]
Modeling dominance effects on nonverbal behaviors using granger causality, Kyriaki Kalimeri, Bruno Lepri, Oya Aran, Dinesh Babu Jayagopi, Daniel Gatica-Perez and Fabio Pianesi, in: Proceedings of International Conference on Multimodal Interaction, ICMI 2012, Santa Monica, CA, 2012
attachment
Modulation Frequency Features For Phoneme Recognition In Noisy Speech, Sriram Ganapathy, Samuel Thomas and Hynek Hermansky, in: Journal of Acoustical Society of America - Express Letters, 2008
attachment
Multi-Camera People Tracking with a Probabilistic Occupancy Map, Francois Fleuret, Jerome Berclaz, Richard Lengagne and Pascal Fua, in: IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(2), 2008
Multi-Camera Tracking and Atypical Motion Detection with Behavioral Maps, Jerome Berclaz, Francois Fleuret and Pascal Fua, in: Proceedings of the European Conference on Computer Vision, 2008
Multi-layer Boosting for Pattern Recognition, Francois Fleuret, in: Pattern Recognition Letters, 30, 2009
Multi-Person Visual Focus of Attention from Head Pose and Meeting Contextual Cues, Silèye O. Ba and Jean-Marc Odobez, in: IEEE Trans. on Pattern Analysis and Machine Intelligence, 33(1):101-116, 2011
attachment
Multilingual speech recognition: A posterior based approach, David Imseng, École Polytechnique Fédérale de Lausanne (EPFL), 2013
attachment
Multimodal Signal Processing for Meetings: an Introduction, Andrei Popescu-Belis and Jean Carletta, in: Multimodal Signal Processing: Human Interactions in Meetings, pages 1-11, Cambridge University Press, 2012
attachment
Multistream Speaker Diarization beyond Two Acoustic Feature Streams, Deepu Vijayasenan, Fabio Valente and Hervé Bourlard, in: International Conference on Acoustics, Speech, and Signal Processing, 2010
attachment
Multistream Speaker Diarization through Information Bottleneck System Outputs Combination, Deepu Vijayasenan, Fabio Valente and Petr Motlicek, in: Proceedings of International Conference on Acoustics, Speech and Signal Processing, 2011
attachment
Mutual Information based Channel Selection for Speaker Diarization of Meetings Data, Deepu Vijayasenan, Fabio Valente and Hervé Bourlard, in: Proceedings of International Conference on Acoustics, Speech and Signal Processing, 2009
attachment

N

Non-linear mapping for multi-channel speech separation and robust overlapping speech recognition, Weifeng Li, John Dines, Mathew Magimai.-Doss and Hervé Bourlard, in: Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2009
attachment

O

One of a Kind: Inferring Personality Impressions in Meetings, Oya Aran and Daniel Gatica-Perez, in: 15th ACM International Conference on Multimodal Interaction, 2013
attachment

P

Phoneme Recognition using Boosted Binary Features, Anindya Roy, Mathew Magimai.-Doss and Sébastien Marcel, in: IEEE Intl. Conference on Acoustics, Speech and Signal Processing 2011, 2011
attachment
Posterior features applied to speech recognition tasks with user-defined vocabulary, Guillermo Aradilla, Hervé Bourlard and Mathew Magimai.-Doss, in: Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2009
attachment
Predicting Remote Versus Collocated Group Interactions using Nonverbal Cues, Dairazalia Sanchez-Cortes, Dinesh Babu Jayagopi and Daniel Gatica-Perez, in: Proc. Int. Conf. on Multimodal Interfaces, Workshop on Multimodal Sensor-Based Systems and Mobile Phones for Social Computing, Cambridge, 2009
[DOI]
Principled Detection-by-classification from Multiple Views, Jerome Berclaz, Francois Fleuret and Pascal Fua, in: Proceedings of the International Conference on Computer Vision Theory and Applications, 2008

R

Recognizing conversational context in group interaction using privacy-sensitive mobile sensors, Dinesh Babu Jayagopi, Taemie Kim, Alex Pentland and Daniel Gatica-Perez, in: Proceedings of International Conference on Mobile and Ubiquitous Multimedia, Limassol, Cyprus, 2010
attachment
Recognizing Human Visual Focus of Attention from Head Pose in Meetings, Silèye O. Ba and Jean-Marc Odobez, in: IEEE Transactions on Systems, Man, and Cybernetics, Part B, 39(1), 2009
attachment
Recurrent Convolutional Neural Networks for Scene Labeling, Pedro H. O. Pinheiro and Ronan Collobert, in: 31st International Conference on Machine Learning (ICML), Beijing, China, pages 82-90, JMLR, 2014
attachment
[URL]
Reference-based vs. task-based evaluation of human language technology, Andrei Popescu-Belis, in: LREC 2008 ELRA Workshop on Evaluation, ELRA, Marrakech, Morocco, 2008
attachment
Robust Speaker Diarization for Short Speech Recordings, David Imseng and Gerald Friedland, in: Proceedings of the IEEE workshop on Automatic Speech Recognition and Understanding, Merano, Italy, pages 432-437, 2009
attachment

S

SNR Features for Automatic Speech Recognition, Philip N. Garner, in: Proceedings of the IEEE workshop on Automatic Speech Recognition and Understanding, Merano, Italy, 2009
attachment
Social Signal Processing: Understanding Nonverbal Communication in Social Interactions, Alessandro Vinciarelli and Fabio Valente, in: Proceedings of Measuring Behavior 2010, Eindhoven (The Netherlands), 2010
attachment
Social Signals, their Function, and Automatic Analysis: A Survey, Alessandro Vinciarelli, Maja Pantic, Hervé Bourlard and Alex Pentland, in: Proceedings of International Conference on Multimodal Interfaces (to appear), 2008
attachment
Speaker Diarization, Fabio Valente and Gerald Friedland, in: Multimodal Signal Processing: Human Interactions in Meetings, Cambridge University Press, 2012
[URL]
Speaker Diarization of Meetings based on Speaker Role N-gram Models, Fabio Valente, Deepu Vijayasenan and Petr Motlicek, in: Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2011
attachment
Speech Processing, Mathew Magimai.-Doss, in: Interactive Multimodal Information Management, pages 221-245, EPFL Press, 2013
Stationary Features and Cat Detection, Francois Fleuret and Donald Geman, in: Journal of Machine Learning Research, 9, 2008
Steerable Features for Statistical 3D Dendrite Detection, German Gonzalez, Francois Aguet, Francois Fleuret, Michael Unser and Pascal Fua, in: Proceedings of the International Conference on Medical Image Computing and Computer Assisted Intervention, 2009
Structure and appearance features for robust 3D facial actions tracking, Stéphanie Lefèvre and Jean-Marc Odobez, in: IEEE Proc. Int. Conf. on Multimedia and Expo, IEEE, 2009
attachment

T

Task-based evaluation of meeting browsers: from BET task elicitation to user behavior analysis, Andrei Popescu-Belis, Mike Flynn, Pierre Wellner and Philippe Baudrion, in: 6th International Conference on Language Resources and Evaluation, Marrakech, Morocco, 2008
attachment
The ACLD: Speech-based Just-in-Time Retrieval of Meeting Transcripts, Documents and Websites, Andrei Popescu-Belis, Jonathan Kilgour, Alexandre Nanchen and Peter Poller, in: ACM Multimedia Workshop on Searching Spontaneous Conversational Speech, Florence, Italy, 2010
attachment
The Good, the Bad, and the Angry: Analyzing Crowdsourced Impressions of Vloggers, Joan-Isaac Biel and Daniel Gatica-Perez, in: Proceedings of AAAI International Conference on Weblogs and Social Media, 2012
attachment
The ICSI RT-09 Speaker Diarization System, Gerald Friedland, Adam Janin, David Imseng, Xavier Anguera, Luke Gottlieb, Marijn Huijbregts, Mary Tai Knox and Oriol Vinyals, in: IEEE Transactions on Audio, Speech, and Language Processing, 20(2):371-381, 2012
[DOI]
The Kaldi Speech Recognition Toolkit, Daniel Povey, Arnab Ghoshal, Gilles Boulianne, Lukas Burget, Ondrej Glembek, Nagendra Goel, Mirko Hannemann, Petr Motlicek, Yanmin Qian, Petr Schwarz, Jan Silovsky, Georg Stemmer and Karel Vesely, in: IEEE 2011 Workshop on Automatic Speech Recognition and Understanding, Hilton Waikoloa Village, Big Island, Hawaii, US, IEEE Signal Processing Society, 2011
attachment
Topickr: Flickr Groups and Users Reloaded, Radu-Andrei Negoescu and Daniel Gatica-Perez, in: MM '08: Proc. of the 16th ACM Intl. Conf. on Multimedia, ACM, 2008
Towards Audio-Visual On-line Diarization Of Participants In Group Meetings, Hayley Hung and Gerald Friedland, in: European Conference on Computer Vision Workshop on Multi-camera and Multi-modal Sensor Fusion, 2008
attachment
Towards mixed language speech recognition systems, David Imseng, Hervé Bourlard and Mathew Magimai.-Doss, in: Proceedings of Interspeech, Makuhari, Japan, pages 278-281, 2010
attachment
Tracter: A Lightweight Dataflow Framework, Philip N. Garner and John Dines, in: Proceedings of Interspeech, Makuhari, Japan, 2010
attachment
Tuning-Robust Initialization Methods for Speaker Diarization, David Imseng and Gerald Friedland, in: IEEE Transactions on Audio, Speech, and Language Processing, 18(8):2028-2037, 2010
attachment
[DOI]

U

User Requirements for Meeting Support Technology, Denis Lalanne and Andrei Popescu-Belis, in: Multimodal Signal Processing: Human Interactions in Meetings, pages 210-221, Cambridge University Press, 2012
Using Audio and Visual Cues for Speaker Diarisation Initialisation, Giulia Garau and Hervé Bourlard, in: International Conference on Acoustics, Speech and Signal Processing, 2010
attachment
Using Crowdsourcing to Compare Document Recommendation Strategies for Conversations, Maryam Habibi and Andrei Popescu-Belis, in: RecSys, Recommendation Utility Evaluation (RUE 2012), Dublin, Ireland, pages 15-20, 2012
attachment
Using KL-divergence and multilingual information to improve ASR for under-resourced languages, David Imseng, Hervé Bourlard and Philip N. Garner, in: Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing, Kyoto, pages 4869-4872, 2012
attachment

V

View-Based Appearance Model Online Learning for 3D Deformable Face Tracking, Stéphanie Lefèvre and Jean-Marc Odobez, in: Proc. Int. Conf. on Computer Vision Theory and Applications, Angers, 2010
attachment
Vlogcast Yourself: Nonverbal Behavior and Attention in Social Media, Joan-Isaac Biel and Daniel Gatica-Perez, in: Proceedings International Conference on Multimodal Interfaces (ICMI-MLMI), 2010
attachment
VlogSense: Conversational Behavior and Social Attention in YouTube, Joan-Isaac Biel and Daniel Gatica-Perez, in: Transactions on Multimedia Computing, Communications and Applications, 2011
attachment
Voices of Vlogging, Joan-Isaac Biel and Daniel Gatica-Perez, in: Proceedings of AAAI International Conference on Weblogs and Social Media, Washington DC, 2010
attachment
Volterra Series for Analyzing MLP based Phoneme Posterior Probability Estimator, Joel Praveen Pinto, G. S. V. S. Sivaram, Hynek Hermansky and Mathew Magimai.-Doss, in: Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2009
attachment

W

Wearing a YouTube hat: directors, comedians, gurus, and user aggregated behavior, Joan-Isaac Biel and Daniel Gatica-Perez, in: Proceedings of the 17th ACM International Conference on Multimedia, ACM, 2009
attachment

Y

You Are Known by How You Vlog: Personality Impressions and Nonverbal Behavior in YouTube, Joan-Isaac Biel, Oya Aran and Daniel Gatica-Perez, in: Proceedings of AAAI International Conference on Weblogs and Social Media, Barcelona, 2011
attachment