Publications of IM2 sorted by journal and type
Publications of type Idiap-RR
2017
From Research to Reality: Evaluation of a Single-Computer Real-Time LVCSR System for Speech-Based Retrieval, , , and , Idiap-RR-12-2017 |
|
2015
Joint Similarity Learning for Predicting Links in Networks with Multiple-type Links, and , Idiap-RR-29-2015 |
|
2014
Feature Mapping of Multiple Beamformed Sources for Robust Overlapping Speech Recognition Using a Microphone Array, , , , , , and , Idiap-RR-17-2014 |
|
2013
Comparing different acoustic modeling techniques for multilingual boosting, , , , and , Idiap-RR-01-2013 |
|
End-to-end Phoneme Sequence Recognition using Convolutional Neural Networks, , and , Idiap-RR-40-2013 |
|
Estimating Phoneme Class Conditional Probabilities from Raw Speech Signal using Convolutional Neural Networks, , and , Idiap-RR-13-2013 |
|
MediaParl: Bilingual mixed language accented speech database, , , , , and , Idiap-RR-03-2013 |
|
Recurrent Convolutional Neural Networks for Scene Labeling, and , Idiap-RR-41-2013 |
|
Recurrent Convolutional Neural Networks for Scene Parsing, and , Idiap-RR-22-2013 |
|
Using out-of-language data to improve an under-resourced speech recognizer, , , and , Idiap-RR-09-2013 |
|
2012
Automatic Social Role Recognition In Professional Meetings, and , Idiap-RR-35-2012 |
|
Boosting under-resourced speech recognizers by exploiting out of language data - Case study on Afrikaans, , and , Idiap-RR-15-2012 |
|
Fast and flexible Kullback-Leibler divergence based acoustic modeling for non-native speech recognition, , and , Idiap-RR-01-2012 |
|
IMPROVING ACOUSTIC BASED KEYWORD SPOTTING USING LVCSR LATTICES, , and , Idiap-RR-36-2012 |
|
Improving Object Classification using Pose Information, , , and , Idiap-RR-30-2012 |
|
Using Crowdsourcing to Compare Document Recommendation Strategies for Conversations, and , Idiap-RR-14-2012 |
|
2011
A Speech-based Just-in-Time Retrieval System using Semantic Search, , , and , Idiap-RR-31-2011 |
|
AN INTEGRATED FRAMEWORK FOR MULTI-CHANNEL MULTI-SOURCE LOCALIZATION AND VOICE ACTIVITY DETECTION, , , , and , Idiap-RR-16-2011 |
|
BROADBAND BEAMPATTERN FOR MULTI-CHANNEL SPEECH ACQUISITION AND DISTANT SPEECH RECOGNITION, , and , Idiap-RR-39-2011 |
|
Cepstral normalisation and the signal to noise ratio spectrum in automatic speech recognition, , Idiap-RR-15-2011 |
|
Continuous Speech Recognition using Boosted Binary Features, , and , Idiap-RR-35-2011 |
|
Finding Information in Multimedia Records of Meetings, , and , Idiap-RR-32-2011 |
|
IMPROVING MICROPHONE ARRAY SPEECH RECOGNITION WITH COCHLEAR IMPLANT-LIKE SPECTRALLY REDUCED SPEECH, , and , Idiap-RR-40-2011 |
|
Improving non-native ASR through stochastic multilingual phoneme space transformations, , , , and , Idiap-RR-19-2011 |
|
Language dependent universal phoneme posterior estimation for mixed language speech recognition, , , and , Idiap-RR-13-2011 |
|
When Users Meet Technology: The Meeting Browser Development Helix, , and , Idiap-RR-05-2011 |
|
2010
Advances in Fast Multistream Diarization based on the Information Bottleneck Framework, , and , Idiap-RR-23-2010 |
|
An Adaptive Initialization Method for Speaker Diarization based on Prosodic Features, and , Idiap-RR-02-2010 |
|
An Information Theoretic Combination of MFCC and TDOA Features for Speaker Diarization, , and , Idiap-RR-22-2010 |
|
Crossmodal Matching of Speakers using Lip and Voice Features in Temporally Non-overlapping Audio and Video Streams, and , Idiap-RR-13-2010 |
|
English Spoken Term Detection in Multilingual Recordings, , and , Idiap-RR-21-2010 |
|
Estimating Cohesion in Small Groups using Audio-Visual Nonverbal Behavior, and , Idiap-RR-12-2010 |
|
Evaluating the Robustness of Privacy-Sensitive Audio Features for Speech Detection in Personal Audio Log Scenarios, , , and , Idiap-RR-01-2010 |
|
Fast Bounding Box Estimation based Face Detection, and , Idiap-RR-38-2010 |
|
Hierarchical Multilayer Perceptron based Language Identification, , and , Idiap-RR-14-2010 |
|
Hierarchical Tandem Features for ASR in Mandarin, , and , Idiap-RR-39-2010 |
|
Introducing Crossmodal Biometrics: Person Identification from Distinct Audio & Visual Streams, and , Idiap-RR-29-2010 |
|
KL Realignment for Speaker Diarization with Multiple Feature Streams, , and , Idiap-RR-24-2010 |
|
Modeling and Understanding Flickr Communities through Topic-based Analysis, and , Idiap-RR-19-2010 |
|
The ACLD: Speech-based Just-in-Time Retrieval of Multimedia Documents and Websites, , , and , Idiap-RR-26-2010 |
|
Towards mixed language speech recognition systems, , and , Idiap-RR-15-2010 |
|
Tracter: A Lightweight Dataflow Framework, and , Idiap-RR-10-2010 |
|
Tuning-Robust Initialization Methods for Speaker Diarization, and , Idiap-RR-35-2010 |
|
2009
A MAP Approach to Noise Compensation of Speech, , Idiap-RR-08-2009 |
|
APPLICATIONS OF SIGNAL ANALYSIS USING AUTOREGRESSIVE MODELS FOR AMPLITUDE MODULATION, , , and , Idiap-RR-35-2009 |
|
Automatic Out-of-Language Detection based on Confidence Measures derived from LVCSR Word and Phone Lattices, , Idiap-RR-06-2009 |
|
Automatic vs. human question answering over multimedia meeting recordings, and , Idiap-RR-13-2009 |
|
ClusterRank: A Graph Based Method for Meeting Summarization, , , and , Idiap-RR-09-2009 |
|
Comparing meeting browsers using a task-based evaluation method, , Idiap-RR-11-2009 |
|
Haar Local Binary Pattern Feature for Fast Illumination Invariant Face Detection, and , Idiap-RR-28-2009 |
|
Investigating Privacy-Sensitive Features for Speech Detection in Multiparty Conversations, , , and , Idiap-RR-12-2009 |
|
Multiple Object Tracking using Flow Linear Programming, , and , Idiap-RR-10-2009 |
|
Novel initialization methods for Speaker Diarization, , Idiap-RR-07-2009 |
|
On Joint Modelling of Grapheme and Phoneme Information using KL-HMM for ASR, , and , Idiap-RR-24-2009 |
|
Real-Time ASR from Meetings, , , , , , , , and , Idiap-RR-15-2009 |
|
Robust Speaker Diarization for Short Speech Recordings, and , Idiap-RR-26-2009 |
|
SNR Features for Automatic Speech Recognition, , Idiap-RR-25-2009 |
|
Speaker Change Detection with Privacy-Preserving Audio Cues, , , and , Idiap-RR-23-2009 |
|
Speech/Non-Speech Detection in Meetings from Automatically Extracted Low Resolution Visual Features, and , Idiap-RR-20-2009 |
|
User Interface Design in a Just-in-time Retrieval System for Meetings, , , , , , and , Idiap-RR-38-2009 |
|
Visual activity context for focus of attention estimation in dynamic meetings, , and , Idiap-RR-02-2009 |
|
2008
Entropy coding of Quantized Spectral Components in FDLP audio codec, , and , Idiap-RR-71-2008 |
|
Identifying Dominant People in Meetings from Audio-Visual Sensors, and , Idiap-RR-65-2008 |
|
Kernel Based Text-Independent Speaker Verification, , and , Idiap-RR-68-2008 |
|
Low-Delay Error Resilient Speech Coding Using Sub-band Hilbert Envelopes, , and , Idiap-RR-75-2008 |
|
MODIFIED DISCRETE COSINE TRANSFORM FOR ENCODING RESIDUAL SIGNALS IN FREQUENCY DOMAIN LINEAR PREDICTION, , and , Idiap-RR-74-2008 |
|
Modulation Frequency Features For Phoneme Recognition In Noisy Speech, , and , Idiap-RR-70-2008 |
|
Multi-layer Boosting for Pattern Recognition, , Idiap-RR-76-2008 |
|
Volterra Series for Analyzing MLP based Phoneme Posterior Probability Estimator, , , and , Idiap-RR-69-2008 |
|
Publications of type Idiap-Com
2013
Who Wants To Be A Millionaire? (II), , and , Idiap-Com-02-2013 |
|
2011
Face Detection using Ferns, and , Idiap-Com-01-2011 |
|
2010
Finding without searching, , Idiap-Com-01-2010 |
|
Signal Processing
Ad Hoc Microphone Array Calibration: Euclidean Distance Matrix Completion Algorithm and Theoretical Guarantees, , , , and , in: Signal Processing, 107:123–140, 2015 |
[DOI] |
IEEE Transactions on Affective Computing
What Your Face Vlogs About: Expressions of Emotion and Big-Five Traits Impressions in YouTube, , , and , in: IEEE Transactions on Affective Computing, 2014 |
|
IEEE Transactions on Multimedia
Broadcasting oneself: Visual Discovery of Vlogging Styles, , and , in: IEEE Transactions on Multimedia, 16(1):201-215, 2014 |
[DOI] |
Mining Crowdsourced First Impressions in Online Social Video, and , in: IEEE Transactions on Multimedia, 16(7), 2014 |
|
Signal Processing
Enhanced Diffuse Field Model for Ad Hoc Microphone Array Calibration, , and , in: Signal Processing, 101:242-255, 2014 |
|
15th ACM International Conference on Multimodal Interaction, Sydney, Australia, ACM, 2013
Hi YouTube! Personality Impressions and Verbal Content in Social Video, , , and , in: 15th ACM International Conference on Multimodal Interaction, Sydney, Australia, ACM, 2013, 2013 |
|
Artificial Intelligence Journal
Computing Text Semantic Relatedness using the Contents and Links of a Hypertext Encyclopedia, and , in: Artificial Intelligence Journal, 194:176–202, 2013 |
[DOI] |
IEEE Signal Processing Letters
A Savitzky-Golay Filtering Perspective of Dynamic Feature Computation, , and , in: IEEE Signal Processing Letters, 20(3):281-284, 2013 |
[DOI] |
IEEE Signal Processing Magazine, Special Issue on Source Separation and Applications
Convexity in source separation: Models, geometry, and algorithms, , , , and , in: IEEE Signal Processing Magazine, Special Issue on Source Separation and Applications, 2013 |
|
IEEE Transactions on Audio, Speech, and Language Processing
Applying multi- and cross-lingual stochastic phone space transformations to non-native speech recognition, , , , and , in: IEEE Transactions on Audio, Speech, and Language Processing, 2013 |
[DOI] |
Speech Communication
Using out-of-language data to improve an under-resourced speech recognizer, , , and , in: Speech Communication, 2013 |
[DOI] [URL] |
IEEE Multimedia
Finding Information in Multimedia Records of Meetings, , and , in: IEEE Multimedia, 19(2):48-57, 2012 |
[DOI] [URL] |
IEEE Transactions on Audio, Speech, and Language Processing
The ICSI RT-09 Speaker Diarization System, , , , , , , and , in: IEEE Transactions on Audio, Speech, and Language Processing, 20(2):371--381, 2012 |
[DOI] |
Transcribing meetings with the AMIDA systems, , , , , , , , , and , in: IEEE Transactions on Audio, Speech, and Language Processing, 20(2):486--498, 2012 |
[DOI] [URL] |
IEEE Transactions on Information Forensics and Security
A Fast Parts-based Approach to Speaker Verification using Boosted Slice Classifiers, , and , in: IEEE Transactions on Information Forensics and Security, 7(1):241-254, 2012 |
|
IEEE Transactions on Multimedia
The YouTube Lens: Crowdsourced Personality Impressions and Audiovisual Analysis of Vlogs, and , in: IEEE Transactions on Multimedia, 2012 |
|
Speech Communication
Multistream speaker diarization of meetings recordings beyond MFCC and TDOA features, , and , in: Speech Communication, 54(1), 2012 |
[DOI] |
Phase AutoCorrelation (PAC) features for noise robust speech recognition, , , and , in: Speech Communication, 54(7):867–880, 2012 |
[DOI] |
Computer Speech and Language
Automatic Identification of Discourse Markers in Multiparty Dialogues: An In-Depth Study of Like and Well, and , in: Computer Speech and Language, 25(3):499-518, 2011 |
[DOI] |
IEEE Trans. on Pattern Analysis and Machine Intelligence
Multi-Person Visual Focus of Attention from Head Pose and Meeting Contextual Cues, and , in: IEEE Trans. on Pattern Analysis and Machine Intelligence, 33(1):101-116, 2011 |
|
IEEE Transactions on Audio, Speech, and Language Processing
Analysis of MLP Based Hierarchical Phoneme Posterior Probability Estimator, , , , and , in: IEEE Transactions on Audio, Speech, and Language Processing, 19(2):225-241, 2011 |
|
Sadhana
Current trends in multilingual speech processing, , , , , , , , and , in: Sadhana, 36(5):885–915, 2011 |
[DOI] [URL] |
Speech Communication
Cepstral normalisation and the signal to noise ratio spectrum in automatic speech recognition, , in: Speech Communication, 53(8):991--1001, 2011 |
[DOI] |
Springer Multimedia Systems Journal
Privacy-sensitive recognition of group conversational context with sociometers, , , and , in: Springer Multimedia Systems Journal, 2011 |
|
Transactions on Multimedia Computing, Communications and Applications
VlogSense: Conversational Behavior and Social Attention in YouTube, and , in: Transactions on Multimedia Computing, Communications and Applications, 2011 |
|
IEEE Transactions on Audio, Speech, and Language Processing
Tuning-Robust Initialization Methods for Speaker Diarization, and , in: IEEE Transactions on Audio, Speech, and Language Processing, 18(8):2028-2037, 2010 |
[DOI] |
IEEE Transactions on Multimedia
Mining group nonverbal conversational patterns using probabilistic topic models, and , in: IEEE Transactions on Multimedia, 2010 |
|
EURASIP Journal on Image and Video Processing, Special Issue on Patches in Vision
Contextual classification of image patches with latent aspect models, , , and , in: EURASIP Journal on Image and Video Processing, Special Issue on Patches in Vision, 2009 |
|
IEEE Signal Processing Magazine
Capturing Order in Social Interactions, , in: IEEE Signal Processing Magazine, 2009 |
|
IEEE Transactions on Audio Speech and Language Processing
An Information Theoretic Approach to Speaker Diarization of Meeting Data, , and , in: IEEE Transactions on Audio Speech and Language Processing, 17(7), 2009 |
[DOI] |
IEEE Transactions on Multimedia
Automatic Role Recognition in Multiparty Recordings: Using Social Affiliation Networks for Feature Extraction, , and , in: IEEE Transactions on Multimedia, 11(7), 2009 |
|
IEEE Transactions on Systems, Man, Cybernetics, Part-B
Recognizing Human Visual Focus of Attention from Head Pose in Meetings, and , in: IEEE Transactions on Systems, Man, Cybernetics, Part-B, 39(1), 2009 |
|
Image and Vision Computing
Social Signal Processing: Survey of an Emerging Domain, , and , in: Image and Vision Computing, 2009 |
|
Linguistica Antverpiensia New Series
The FEMTI guidelines for contextual MT evaluation: principles and tools, , and , in: Linguistica Antverpiensia New Series, 8, 2009 |
Pattern Recognition Letters
Multi-layer Boosting for Pattern Recognition, , in: Pattern Recognition Letters, 30, 2009 |
IEEE Trans. on Pattern Analysis and Machine Intelligence
Tracking the visual focus of attention for a varying number of wandering people, , , and , in: IEEE Trans. on Pattern Analysis and Machine Intelligence, 30(7), 2008 |
|
IEEE Transactions on Audio, Speech and Language Processing
Modeling Dominance in Group Conversations using NonVerbal Activity Cues, , , and , in: IEEE Transactions on Audio, Speech and Language Processing, 2008 |
|
IEEE Transactions on Biomedical Engineering
Fast Recognition of Anticipation Related Potentials, , and , in: IEEE Transactions on Biomedical Engineering, 2008 |
|
IEEE Transactions on Pattern Analysis and Machine Intelligence
Classification-based Probabilistic Modeling of Texture Transition for Fast Line Search Tracking and Delineation, , , and , in: IEEE Transactions on Pattern Analysis and Machine Intelligence, 2008 |
Multi-Camera People Tracking with a Probabilistic Occupancy Map, , , and , in: IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(2), 2008 |
Journal of the Acoustical Society of America - Express Letters
Modulation Frequency Features For Phoneme Recognition In Noisy Speech, , and , in: Journal of the Acoustical Society of America - Express Letters, 2008 |
|
Journal of Machine Learning Research
Stationary Features and Cat Detection, and , in: Journal of Machine Learning Research, 9, 2008 |
Language Resources and Evaluation
Dimensionality of Dialogue Act Tagsets: An Empirical Analysis of Large Corpora, , in: Language Resources and Evaluation, 42(1), 2008 |
[DOI] |
Publications of type Book
2013
Interactive Multimodal Information Management, and , EPFL Press, 2013 |
2012
Multimodal Signal Processing: Human Interactions in Meetings, , , and , Cambridge University Press, 2012 |
[URL] |
2008
Machine Learning for Multimodal Interaction IV, , and , Springer-Verlag, LNCS, volume 4892, 2008 |
[DOI] |
Machine Learning for Multimodal Interaction V, and , Springer-Verlag, LNCS, volume 5237, 2008 |
[DOI] |
Interactive Multimodal Information Management (2013)
Interactive Multimodal Information Management: Shaping the Vision, and , in: Interactive Multimodal Information Management, pages 1-17, EPFL Press, 2013 |
|
Learning to learn new models of human activities in indoor settings, , , and , in: Interactive Multimodal Information Management, EPFL Press, 2013 |
|
Medical image annotation, , in: Interactive Multimodal Information Management, EPFL Press, 2013 |
|
Speech Processing, , in: Interactive Multimodal Information Management, pages 221--245, EPFL Press, 2013 |
Multimodal Signal Processing: Human Interactions in Meetings (2012)
Evaluation of Meeting Support Technology, and , in: Multimodal Signal Processing: Human Interactions in Meetings, pages 237-252, Cambridge University Press, 2012 |
Multimodal Signal Processing for Meetings: an Introduction, and , in: Multimodal Signal Processing: Human Interactions in Meetings, pages 1-11, Cambridge University Press, 2012 |
|
Speaker Diarization, and , in: Multimodal Signal Processing: Human Interactions in Meetings, Cambridge University Press, 2012 |
[URL] |
User Requirements for Meeting Support Technology, and , in: Multimodal Signal Processing: Human Interactions in Meetings, pages 210-221, Cambridge University Press, 2012 |
Computer Analysis of Human Behavior (2011)
Analysis of Group Conversations: Modeling Social Verticality, and , in: Computer Analysis of Human Behavior, pages 293-322, Springer London, 2011 |
Social Media Computing (2011)
Call me Guru: user categories and large-scale behavior in YouTube, and , in: Social Media Computing, Springer, 2011 |
|
Multimodal Corpora: From Models of Natural Interaction to Systems and Applications (2009)
Accessing a Large Multimodal Corpus using an Automatic Content Linking Device, , , and , in: Multimodal Corpora: From Models of Natural Interaction to Systems and Applications, Springer-Verlag, 2009 |
[DOI] |
Multimodal Signal Processing for Human-Computer Interaction (2009)
Managing Multimodal Data, Metadata and Annotations: Challenges and Solutions, , in: Multimodal Signal Processing for Human-Computer Interaction, Elsevier / Academic Press, 2009 |
Machine Learning for Multimodal Interaction IV (2008)
Towards an Objective Test for Meeting Browsers: the BET4TQB Pilot Experiment, , , and , in: Machine Learning for Multimodal Interaction IV, Springer-Verlag, 2008 |
[DOI] |
Scandinavian Conference on Image Analysis (2015)
Gender Classification by LUT based boosting of Overlapping Block Patterns, , and , in: Scandinavian Conference on Image Analysis, pages 530-542, Springer International Publishing, 2015 |
[DOI] [URL] |
Proceedings of the 4th Joint Workshop on Hands-free speech communication and Microphone Arrays (2014)
Ad-Hoc Microphone Array Calibration from Partial Distance Measurements, , , and , in: Proceedings of the 4th Joint Workshop on Hands-free speech communication and Microphone Arrays, Villers-les-Nancy, pages 1 - 5, IEEE, 2014 |
[DOI] |
Proceedings of Interspeech (2014)
Automatic Speech Recognition and Translation of a Swiss German Dialect: Walliserdeutsch, , and , in: Proceedings of Interspeech, 2014 |
|
Detecting speaker roles and topic changes in multiparty conversations using latent topic models, and , in: Proceedings of Interspeech, 2014 |
|
Proceedings of the Coling 2014 (25th International Conference on Computational Linguistics) (2014)
Enforcing Topic Diversity in a Document Recommender for Conversations, and , in: Proceedings of the Coling 2014 (25th International Conference on Computational Linguistics), Dublin, Ireland, pages 746-759, IEEE, 2014 |
|
2014 IEEE International Conference on Acoustics, Speech and Signal Processing (2014)
Model-based Sparse Component Analysis for Reverberant Speech Localization, , , and , in: 2014 IEEE International Conference on Acoustics, Speech and Signal Processing, pages 1439 - 1443, IEEE, 2014 |
[DOI] |
31st International Conference on Machine Learning (ICML) (2014)
Recurrent Convolutional Neural Networks for Scene Labeling, and , in: 31st International Conference on Machine Learning (ICML), Beijing, China, pages 82-90, JMLR, 2014 |
[URL] |
Proceedings of the ACM International Conference on Multimedia (2014)
The Workshop on Computational Personality Recognition 2014, , , , , and , in: Proceedings of the ACM International Conference on Multimedia, 2014 |
|
Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) (2013)
A Probabilistic Framework for Multiple Speaker Localization, , , and , in: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2013 |
|
Proceedings of Interspeech (2013)
Automatic Social Role Recognition In Professional Meetings Using Conditional Random Fields, and , in: Proceedings of Interspeech, 2013 |
|
International Joint Conference on Artificial Intelligence (2013)
Computing Text Semantic Relatedness using the Contents and Links of a Hypertext Encyclopedia, and , in: International Joint Conference on Artificial Intelligence, 2013 |
|
Proceedings of the 6th Workshop on Eye Gaze in Intelligent Human Machine Interaction: Gaze in Multimodal Interaction (2013)
Context Aware Addressee Estimation for Human Robot Interaction, , , and , in: Proceedings of the 6th Workshop on Eye Gaze in Intelligent Human Machine Interaction: Gaze in Multimodal Interaction, 2013 |
15th ACM International Conference on Multimodal Interaction (2013)
Cross-Domain Personality Prediction: From Video Blogs to Small Group Meetings, and , in: 15th ACM International Conference on Multimodal Interaction, 2013 |
|
Proceedings of the ACL 2013 (51st Annual Meeting of the Association for Computational Linguistics), Short Papers (2013)
Diverse Keyword Extraction from Conversations, and , in: Proceedings of the ACL 2013 (51st Annual Meeting of the Association for Computational Linguistics), Short Papers, Sofia, Bulgaria, pages 651-657, ACL, 2013 |
|
Proceedings of Interspeech (2013)
Estimating Phoneme Class Conditional Probabilities from Raw Speech Signal using Convolutional Neural Networks, , and , in: Proceedings of Interspeech, 2013 |
|
Proceedings IEEE International Conference On Digital Signal Processing (2013)
Euclidean Distance Matrix Completion for Ad-hoc Microphone Array Calibration, , , and , in: Proceedings IEEE International Conference On Digital Signal Processing, 2013 |
|
Proceedings of IEEE TENCON (2013)
Gammatone Wavelet Cepstral Coefficients for Robust Speech Recognition, , and , in: Proceedings of IEEE TENCON, 2013 |
|
Affective Computing and Intelligent Interaction (2013)
Investigating the Impact of Language Style and Vocal Expression on Social Roles of Participants in Professional Meetings, and , in: Affective Computing and Intelligent Interaction, Geneva, pages 324-329, IEEE, 2013 |
[DOI] |
Mining and Learning with Graphs (2013)
Learning to Rank on Network Data, , and , in: Mining and Learning with Graphs, 2013 |
|
Proceedings of the 15th ACM on International Conference on Multimodal Interaction (2013)
Leveraging the robot dialog state for visual focus of attention recognition, , , , and , in: Proceedings of the 15th ACM on International Conference on Multimodal Interaction, 2013 |
IEEE International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (2013)
Manifold Sparse Beamforming, , and , in: IEEE International Workshop on Computational Advances in Multi-Sensor Adaptive Processing, Saint Martin, France, pages 113-116, IEEE, 2013 |
[DOI] |
15th ACM International Conference on Multimodal Interaction (2013)
One of a Kind: Inferring Personality Impressions in Meetings, and , in: 15th ACM International Conference on Multimodal Interaction, 2013 |
|
Proceedings of the 11th International Conference on Mobile and Ubiquitous Multimedia (2012)
Assessing the Impact of Language Style on Emergent Leadership Perception from Ubiquitous Audio, , and , in: Proceedings of the 11th International Conference on Mobile and Ubiquitous Multimedia, Ulm, Germany, 2012 |
|
INTERSPEECH (2012)
Automatic detection of conflict escalation in spoken conversations, , and , in: INTERSPEECH, ISCA, Portland, Oregon, USA, 2012 |
|
Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing (2012)
Automatic detection of conflicts in spoken conversations: ratings and analysis of broadcast political debates, , and , in: Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing, Kyoto, Japan, 2012 |
|
Symposium on Machine Learning in Speech and Language Processing (MLSLP) (2012)
Boosting localized binary features for speech recognition, , and , in: Symposium on Machine Learning in Speech and Language Processing (MLSLP), 2012 |
|
Proceedings of the 3rd International Workshop on Spoken Languages Technologies for Under-resourced Languages (2012)
Boosting under-resourced speech recognizers by exploiting out of language data - Case study on Afrikaans, , and , in: Proceedings of the 3rd International Workshop on Spoken Languages Technologies for Under-resourced Languages, Cape Town, pages 60--67, 2012 |
|
Proceedings of Interspeech (2012)
Combination of Sparse Classification and Multilayer Perceptron for Noise Robust ASR, , , , , and , in: Proceedings of Interspeech, 2012 |
|
Proceedings of the IEEE Workshop on Spoken Language Technology (2012)
COMBINING CEPSTRAL NORMALIZATION AND COCHLEAR IMPLANT-LIKE SPEECH PROCESSING FOR MICROPHONE ARRAY-BASED SPEECH RECOGNITION, , and , in: Proceedings of the IEEE Workshop on Spoken Language Technology, 2012 |
|
Proceedings of Interspeech (2012)
Comparing different acoustic modeling techniques for multilingual boosting, , , , and , in: Proceedings of Interspeech, Portland, Oregon, 2012 |
|
Proceedings International Conference on Multimodal Interfaces (ICMI-MLMI) (2012)
FaceTube: predicting personality from facial expressions of emotion in online conversational video, , and , in: Proceedings International Conference on Multimodal Interfaces (ICMI-MLMI), 2012 |
|
Proceedings on IEEE International Conference on Acoustics, Speech and Signal Processing (2012)
IMPROVING ACOUSTIC BASED KEYWORD SPOTTING USING LVCSR LATTICES, , and , in: Proceedings on IEEE International Conference on Acoustics, Speech and Signal Processing, IEEE, Japan, pages 4413-4416, 2012 |
Proceedings of the 2012 IEEE Workshop on Spoken Language Technology (2012)
MediaParl: Bilingual mixed language accented speech database, , , , , and , in: Proceedings of the 2012 IEEE Workshop on Spoken Language Technology, pages 263--268, 2012 |
|
IEEE 7th Sensor Array and Multichannel Signal Processing Workshop (SAM) (2012)
Microphone Array Beampattern Characterization for Hands-free Speech Applications, , and , in: IEEE 7th Sensor Array and Multichannel Signal Processing Workshop (SAM), Hoboken, NJ, USA, pages 473-476, 2012 |
|
Proceedings of International Conference on Multimodal Interaction, ICMI 2012, Santa Monica, CA (2012)
Modeling dominance effects on nonverbal behaviors using granger causality, , , , , and , in: Proceedings of International Conference on Multimodal Interaction, ICMI 2012, Santa Monica, CA, 2012 |
|
Proceedings of AAAI International Conference on Weblogs and Social Media (2012)
The Good, the Bad, and the Angry: Analyzing Crowdsourced Impressions of Vloggers, and , in: Proceedings of AAAI International Conference on Weblogs and Social Media, 2012 |
|
RecSys, Recommendation Utility Evaluation (RUE 2012) (2012)
Using Crowdsourcing to Compare Document Recommendation Strategies for Conversations, and , in: RecSys, Recommendation Utility Evaluation (RUE 2012), Dublin, Ireland, pages 15-20, 2012 |
|
Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing (2012)
Using KL-divergence and multilingual information to improve ASR for under-resourced languages, , and , in: Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing, Kyoto, pages 4869--4872, 2012 |
|
Proceedings of Interspeech (2012)
Using Sparse Classification Outputs as Feature Observations for Noise Robust ASR, , , , , and , in: Proceedings of Interspeech, 2012 |
|
SIGDIAL 2011 (12th annual SIGDIAL Meeting on Discourse and Dialogue), Demonstration Session (2011)
A Just-in-Time Document Retrieval System for Dialogues or Monologues, , , and , in: SIGDIAL 2011 (12th annual SIGDIAL Meeting on Discourse and Dialogue), Demonstration Session, Portland, OR, pages 350-352, 2011 |
|
Proceedings of the ACL-HLT 2011 System Demonstrations (49th Annual Meeting of the Association for Computational Linguistics) (2011)
A Speech-based Just-in-Time Retrieval System using Semantic Search, , , and , in: Proceedings of the ACL-HLT 2011 System Demonstrations (49th Annual Meeting of the Association for Computational Linguistics), Portland, OR, pages 80-86, 2011 |
[URL] |
The Third Joint Workshop on Hands-free Speech Communication and Microphone Arrays (2011)
An Integrated Framework for Multi-Channel Multi-Source Localization and Voice Activity Detection, , , , and , in: The Third Joint Workshop on Hands-free Speech Communication and Microphone Arrays, 2011 |
|
Proceedings of Interspeech 2011 (2011)
Analysis and Comparison of Recent MLP Features for LVCSR Systems, , and , in: Proceedings of Interspeech 2011, 2011 |
|
Proceedings of the IEEE workshop on Automatic Speech Recognition and Understanding (2011)
Fast and flexible Kullback-Leibler divergence based acoustic modeling for non-native speech recognition, , and , in: Proceedings of the IEEE workshop on Automatic Speech Recognition and Understanding, Hawaii, USA, pages 348-353, 2011 |
|
IAPR IEEE International Joint Conference on Biometrics (2011)
Fast Speaker Verification on Mobile Phone data using Boosted Slice Classifiers, , and , in: IAPR IEEE International Joint Conference on Biometrics, Washington DC, 2011 |
|
Proceedings of Interspeech (2011)
Hierarchical Tandem Features for ASR in Mandarin, , and , in: Proceedings of Interspeech, 2011 |
Improving non-native ASR through stochastic multilingual phoneme space transformations, , , , and , in: Proceedings of Interspeech, Florence, Italy, pages 537-540, 2011 |
|
Interspeech (2011)
Information Bottleneck Features for HMM/GMM Speaker Diarization of Meetings Recordings, and , in: Interspeech, Florence, Italy, pages 953-956, 2011 |
|
Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing (2011)
Language dependent universal phoneme posterior estimation for mixed language speech recognition, , , and , in: Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing, Prague, CZ, pages 5012-5015, 2011 |
|
Proceedings of Interspeech (2011)
Language-Independent Socio-Emotional Role Recognition in the AMI Meetings Corpus, and , in: Proceedings of Interspeech, 2011 |
|
Proceedings of International Conference on Acoustics, Speech and Signal Processing (2011)
MULTISTREAM SPEAKER DIARIZATION THROUGH INFORMATION BOTTLENECK SYSTEM OUTPUTS COMBINATION, , and , in: Proceedings of International Conference on Acoustics, Speech and Signal Processing, 2011 |
|
IEEE Intl. Conference on Acoustics, Speech and Signal Processing 2011 (2011)
Phoneme Recognition using Boosted Binary Features, , and , in: IEEE Intl. Conference on Acoustics, Speech and Signal Processing 2011, 2011 |
|
Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2011)
Speaker Diarization of Meetings based on Speaker Role N-gram Models, , and , in: Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2011 |
|
IEEE 2011 Workshop on Automatic Speech Recognition and Understanding (2011)
The Kaldi Speech Recognition Toolkit, , , , , , , , , , , , and , in: IEEE 2011 Workshop on Automatic Speech Recognition and Understanding, Hilton Waikoloa Village, Big Island, Hawaii, US, IEEE Signal Processing Society, 2011 |
|
Graph-based Methods for Natural Language Processing (2011)
Using a Wikipedia-based Semantic Relatedness Measure for Document Clustering, and , in: Graph-based Methods for Natural Language Processing, 2011 |
|
Proceedings of AAAI International Conference on Weblogs and Social Media (2011)
You Are Known by How You Vlog: Personality Impressions and Nonverbal Behavior in YouTube, , and , in: Proceedings of AAAI International Conference on Weblogs and Social Media, Barcelona, 2011 |
|
Proceedings of Interspeech, Japan (2010)
A Comparative Study of MLP Front-ends for Mandarin ASR, , , , and , in: Proceedings of Interspeech, Japan, 2010 |
|
LREC workshop on Multimodal Corpora: Advances in Capturing, Coding and Analyzing Multimodality, Malta, May 2010 (2010)
A Multimodal Corpus for Studying Dominance in Small Group Conversations, , and , in: LREC workshop on Multimodal Corpora: Advances in Capturing, Coding and Analyzing Multimodality, Malta, May 2010, 2010 |
|
Proceedings of the 4th IEEE International Conference on Semantic Computing (ICSC 2010), Carnegie Mellon University, Pittsburgh, PA, USA (2010)
A Random Walk Framework to Compute Textual Semantic Similarity: a Unified Model for Three Benchmark Tasks, and , in: Proceedings of the 4th IEEE International Conference on Semantic Computing (ICSC 2010), Carnegie Mellon University, Pittsburgh, PA, USA, 2010 |
|
Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing (2010)
An Adaptive Initialization Method for Speaker Diarization based on Prosodic Features, and , in: Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing, Dallas, USA, pages 4946-4949, 2010 |
|
An Alternative Scanning Strategy to Detect Faces, and , in: Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing, Dallas, USA, 2010 |
|
2010 IEEE International Conference on Acoustics, Speech and Signal Processing (2010)
Application of Out-Of-Language Detection To Spoken-Term Detection, and , in: 2010 IEEE International Conference on Acoustics, Speech and Signal Processing, Dallas, USA, 2010 |
|
Proceedings of the 33rd Annual ACM SIGIR Conference (2010)
Automatic Content Linking: Speech-based Just-in-time Retrieval for Multimedia Archives, , , , , and , in: Proceedings of the 33rd Annual ACM SIGIR Conference, Geneva, Switzerland, page 703, 2010 |
|
2010 IEEE International Conference on Acoustics, Speech and Signal Processing (2010)
BOOSTED BINARY FEATURES FOR NOISE-ROBUST SPEAKER VERIFICATION, , and , in: 2010 IEEE International Conference on Acoustics, Speech and Signal Processing, Dallas, Texas, 2010 |
|
Proceedings of Interspeech, Makuhari, Japan, 2010 (2010)
English Spoken Term Detection in Multilingual Recordings, , and , in: Proceedings of Interspeech, Makuhari, Japan, 2010, ISCA, Makuhari, Japan, 2010 |
|
ECCV, Workshop on Face Detection: Where we are, and what next? (2010)
Fast Bounding Box Estimation based Face Detection, and , in: ECCV, Workshop on Face Detection: Where we are, and what next?, 2010 |
|
International Conference on Speech and Language Processing, Interspeech (2010)
Floor Holder Detection and End of Speaker Turn Prediction in Meetings, , and , in: International Conference on Speech and Language Processing, Interspeech, Makuhari, Japan, ISCA, 2010 |
|
Proceedings of Interspeech (2010)
Hierarchical Multilayer Perceptron based Language Identification, , and , in: Proceedings of Interspeech, Makuhari, Japan, pages 2722-2725, 2010 |
|
IEEE Fourth International Conference on Biometrics: Theory, Applications and Systems (2010)
Introducing Crossmodal Biometrics: Person Identification from Distinct Audio & Visual Streams, and , in: IEEE Fourth International Conference on Biometrics: Theory, Applications and Systems, 2010 |
|
Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing (2010)
Leveraging speaker diarization for meeting recognition from distant microphones, , and , in: Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing, pages 4390-4393, 2010 |
International Conference on Acoustics, Speech, and Signal Processing (2010)
Multistream Speaker Diarization beyond Two Acoustic Feature Streams, , and , in: International Conference on Acoustics, Speech, and Signal Processing, 2010 |
|
Proceedings of International Conference on Mobile and Ubiquitous Multimedia, Limassol, Cyprus (2010)
Recognizing conversational context in group interaction using privacy-sensitive mobile sensors, , , and , in: Proceedings of International Conference on Mobile and Ubiquitous Multimedia, Limassol, Cyprus, 2010 |
|
Proceedings of Measuring Behavior 2010, Eindhoven (The Netherlands) (2010)
Social Signal Processing: Understanding Nonverbal Communication in Social Interactions, and , in: Proceedings of Measuring Behavior 2010, Eindhoven (The Netherlands), 2010 |
|
Proceedings of Interspeech (2010)
Sparse Component Analysis for Speech Recognition in Multi-Speaker Environment, , and , in: Proceedings of Interspeech, Makuhari, Japan, 2010 |
|
ACM Multimedia Workshop on Searching Spontaneous Conversational Speech (2010)
The ACLD: Speech-based Just-in-Time Retrieval of Meeting Transcripts, Documents and Websites, , , and , in: ACM Multimedia Workshop on Searching Spontaneous Conversational Speech, Florence, Italy, 2010 |
|
7th International Conference on Language Resources and Evaluation (2010)
Towards a standard for dialogue act annotation, , , , , , , , , , , and , in: 7th International Conference on Language Resources and Evaluation, Malta, 2010 |
|
Proceedings of Interspeech (2010)
Towards mixed language speech recognition systems, , and , in: Proceedings of Interspeech, Makuhari, Japan, pages 278-281, 2010 |
|
Tracter: A Lightweight Dataflow Framework, and , in: Proceedings of Interspeech, Makuhari, Japan, 2010 |
|
International Conference on Acoustics, Speech and Signal Processing (2010)
Using Audio and Visual Cues for Speaker Diarisation Initialisation, and , in: International Conference on Acoustics, Speech and Signal Processing, 2010 |
|
Proceedings of ICASSP (2010)
VARIATIONAL BAYESIAN SPEAKER DIARIZATION OF MEETING RECORDINGS, , and , in: Proceedings of ICASSP, 2010 |
|
Proc. Int. Conf. on Computer Vision Theory and Applications (2010)
View-Based Appearance Model Online Learning for 3D Deformable Face Tracking, and , in: Proc. Int. Conf. on Computer Vision Theory and Applications, Angers, 2010 |
|
Proceedings International Conference on Multimodal Interfaces (ICMI-MLMI) (2010)
Vlogcast Yourself: Nonverbal Behavior and Attention in Social Media, and , in: Proceedings International Conference on Multimodal Interfaces (ICMI-MLMI), 2010 |
|
Proceedings of AAAI International Conference on Weblogs and Social Media, Washington DC (2010)
Voices of Vlogging, and , in: Proceedings of AAAI International Conference on Weblogs and Social Media, Washington DC, 2010 |
|
International Conference on Speech and Language Processing, Interspeech (2010)
Audio–Visual Synchronisation for Speaker Diarisation, , and , in: International Conference on Speech and Language Processing, Interspeech, Makuhari, Japan, 2010 |
|
Proceedings of ICMI-MLMI 2009 (11th International Conference on Multimodal Interfaces and 6th Workshop on Machine Learning for Multimodal Interaction) (2009)
A Multimedia Retrieval System Using Speech Input, , , , , , , , , , and , in: Proceedings of ICMI-MLMI 2009 (11th International Conference on Multimodal Interfaces and 6th Workshop on Machine Learning for Multimodal Interaction), Cambridge, MA, 2009 |
|
IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2009, WASPAA '09. (2009)
APPLICATIONS OF SIGNAL ANALYSIS USING AUTOREGRESSIVE MODELS FOR AMPLITUDE MODULATION, , , and , in: IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2009, WASPAA '09., IEEE, Mohonk Mountain House, New Paltz, New York, USA, 2009 |
|
ACM International Conference on Multimedia (2009)
Automatic Role Recognition in Multiparty Recordings Using Social Networks and Probabilistic Sequential Models, , and , in: ACM International Conference on Multimedia, 2009 |
|
10th Annual Conference of the International Speech Communication Association (2009)
Automatic vs. human question answering over multimedia meeting recordings, and , in: 10th Annual Conference of the International Speech Communication Association, 2009 |
|
Proceedings ICME 2009 (2009)
Characterising Conversational Group Dynamics Using Nonverbal Behaviour, , and , in: Proceedings ICME 2009, 2009 |
|
Proceedings ICMI-MLMI (2009)
Discovering Group Nonverbal Conversational Patterns with Topics, and , in: Proceedings ICMI-MLMI, 2009 |
|
Proceedings of IEEE Conference on Multimedia and Expo (2009)
Implicit Human Centered Tagging, , and , in: Proceedings of IEEE Conference on Multimedia and Expo, 2009 |
|
Proceedings of Interspeech 2009 (2009)
Investigating Privacy-Sensitive Features for Speech Detection in Multiparty Conversations, , , and , in: Proceedings of Interspeech 2009, 2009 |
|
Proceedings of the ACM International Conference on Multimedia (2009)
Investigating the use of Visual Focus of Attention for Audio-Visual Speaker Diarisation, , , and , in: Proceedings of the ACM International Conference on Multimedia, Beijing, China, 2009 |
|
Proceedings of the IEEE International Conference on Computer Vision (2009)
Joint Pose Estimator and Feature Learning for Object Detection, , , and , in: Proceedings of the IEEE International Conference on Computer Vision, 2009 |
10th Annual Conference of the International Speech Communication Association (2009)
KL Realignment for Speaker Diarization with Multiple Feature Streams, , and , in: 10th Annual Conference of the International Speech Communication Association, 2009 |
IEEE Int. Conference on Image Processing, Cairo, Egypt (2009)
Learning Large Margin Likelihood for Realtime Head Pose Tracking, and , in: IEEE Int. Conference on Image Processing, Cairo, Egypt, IEEE, 2009 |
|
Proceedings of the IEEE international conference on Computer Vision and Pattern Recognition (2009)
Learning Rotational Features for Filament Detection, , and , in: Proceedings of the IEEE international conference on Computer Vision and Pattern Recognition, 2009 |
Proceedings of the IEEE workshop on Automatic Speech Recognition and Understanding (2009)
MLP Based Hierarchical System for Task Adaptation in ASR, , and , in: Proceedings of the IEEE workshop on Automatic Speech Recognition and Understanding, Merano, Italy, 2009 |
|
International Conference on Audio, Speech and Signal Processing (2009)
MULTI-MODAL SPEAKER DIARIZATION OF REAL-WORLD MEETINGS USING COMPRESSED-DOMAIN VIDEO FEATURES, , and , in: International Conference on Audio, Speech and Signal Processing, 2009 |
|
Proceedings of International Conference on Acoustics, Speech and Signal Processing (2009)
MUTUAL INFORMATION BASED CHANNEL SELECTION FOR SPEAKER DIARIZATION OF MEETINGS DATA, , and , in: Proceedings of International Conference on Acoustics, Speech and Signal Processing, 2009 |
|
Proceedings of International conference on acoustics speech and signal processing (2009)
Mutual Information based Channel Selection for Speaker Diarization of Meetings Data, , and , in: Proceedings of International conference on acoustics speech and signal processing, 2009 |
Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) (2009)
Non-linear mapping for multi-channel speech separation and robust overlapping speech recognition, , , and , in: Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2009 |
|
Posterior features applied to speech recognition tasks with user-defined vocabulary, , and , in: Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2009 |
|
Proc. Int. Conf. on Multimodal Interfaces, Workshop on Multimodal Sensor-Based Systems and Mobile Phones for Social Computing (2009)
Predicting Remote Versus Collocated Group Interactions using Nonverbal Cues, , and , in: Proc. Int. Conf. on Multimodal Interfaces, Workshop on Multimodal Sensor-Based Systems and Mobile Phones for Social Computing, Cambridge, 2009 |
|
Proceedings of Interspeech (2009)
Real-Time ASR from Meetings, , , , , , , , and , in: Proceedings of Interspeech, Brighton, UK, 2009 |
|
Proceedings of the IEEE workshop on Automatic Speech Recognition and Understanding (2009)
Robust Speaker Diarization for Short Speech Recordings, and , in: Proceedings of the IEEE workshop on Automatic Speech Recognition and Understanding, Merano, Italy, pages 432-437, 2009 |
|
SNR Features for Automatic Speech Recognition, , in: Proceedings of the IEEE workshop on Automatic Speech Recognition and Understanding, Merano, Italy, 2009 |
|
Proceedings of ICMI-MLMI 2009 (2009)
Speaker Change Detection with Privacy-Preserving Audio Cues, , , and , in: Proceedings of ICMI-MLMI 2009, 2009 |
|
Proceedings of the International Conference on Medical Image Computing and Computer Assisted Intervention (2009)
Steerable Features for Statistical 3D Dendrite Detection, , , , and , in: Proceedings of the International Conference on Medical Image Computing and Computer Assisted Intervention, 2009 |
IEEE Proc. Int. Conf. on Multimedia and Expo (2009)
Structure and appearance features for robust 3D facial actions tracking, and , in: IEEE Proc. Int. Conf. on Multimedia and Expo, IEEE, 2009 |
|
International Conference on Multimedia & Expo (2009)
Visual Activity Context For Focus of Attention Estimation in Dynamic Meetings, , and , in: International Conference on Multimedia & Expo, 2009 |
|
ACM Multimedia (2009)
Visual Speaker Localization Aided by Acoustic Models, , and , in: ACM Multimedia, 2009 |
Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2009)
Volterra Series for Analyzing MLP based Phoneme Posterior Probability Estimator, , , and , in: Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2009 |
|
Proceedings of the 17th ACM International Conference on Multimedia (2009)
Wearing a YouTube hat: directors, comedians, gurus, and user aggregated behavior, and , in: Proceedings of the 17th ACM International Conference on Multimedia, ACM, 2009 |
|
Proc. of the Intl. Conf. on Image and Video Retrieval (2008)
Analyzing Flickr Groups, and , in: Proc. of the Intl. Conf. on Image and Video Retrieval, ACM, 2008 |
Proceedings of the European Conference on Computer Vision (2008)
Automated Delineation of Dendritic Networks in Noisy Image Stacks, , and , in: Proceedings of the European Conference on Computer Vision, 2008 |
International Conference on Automatic Face and Gesture Recognition (2008)
Identifying Dominant People in Meetings from Audio-Visual Sensors, and , in: International Conference on Automatic Face and Gesture Recognition, Amsterdam, The Netherlands, 2008 |
|
International Conference on Multi-modal Interfaces (2008)
Investigating Automatic Dominance Estimation in Groups From Visual Attention and Speaking Activity, , , , and , in: International Conference on Multi-modal Interfaces, 2008 |
|
Proceedings of the European Conference on Computer Vision (2008)
Multi-Camera Tracking and Atypical Motion Detection with Behavioral Maps, , and , in: Proceedings of the European Conference on Computer Vision, 2008 |
Proceedings of the International Conference on Computer Vision Theory and Applications (2008)
Principled Detection-by-classification from Multiple Views, , and , in: Proceedings of the International Conference on Computer Vision Theory and Applications, 2008 |
LREC 2008 ELRA Workshop on Evaluation (2008)
Reference-based vs. task-based evaluation of human language technology, , in: LREC 2008 ELRA Workshop on Evaluation, ELRA, Marrakech, Morocco, 2008 |
|
ACM International Conference on Multimedia (2008)
Role Recognition for Meeting Participants: an Approach Based on Lexical Information and Social Network Analysis, , , , and , in: ACM International Conference on Multimedia, Vancouver, Canada, 2008 |
|
International Conference on Multimodal Interfaces (2008)
Role Recognition in Multiparty Recordings using Social Affiliation Networks and Discrete Distributions, , , and , in: International Conference on Multimodal Interfaces, Chania, Greece, 2008 |
|
Proceedings of the ACM International Conference on Multimedia (2008)
Social Signal Processing: State-of-the-Art and Future Perspectives of an Emerging Domain, , , and , in: Proceedings of the ACM International Conference on Multimedia, 2008 |
|
Proceedings of International Conference on Multimodal Interfaces (to appear) (2008)
Social Signals, their Function, and Automatic Analysis: A Survey, , , and , in: Proceedings of International Conference on Multimodal Interfaces (to appear), 2008 |
|
6th International Conference on Language Resources and Evaluation (2008)
Task-based evaluation of meeting browsers: from BET task elicitation to user behavior analysis, , , and , in: 6th International Conference on Language Resources and Evaluation, Marrakech, Morocco, 2008 |
|
MM '08: Proc. of the 16th ACM Intl. Conf. on Multimedia (2008)
Topickr: Flickr Groups and Users Reloaded, and , in: MM '08: Proc. of the 16th ACM Intl. Conf. on Multimedia, ACM, 2008 |
European Conference on Computer Vision Workshop on Multi-camera and Multi-modal Sensor Fusion (2008)
Towards Audio-Visual On-line Diarization Of Participants In Group Meetings, and , in: European Conference on Computer Vision Workshop on Multi-camera and Multi-modal Sensor Fusion, 2008 |
|
IEEE International Conference on Acoustics, Speech, and Signal Processing (2007)
ESTIMATING THE DOMINANT PERSON IN MULTI-PARTY CONVERSATIONS USING SPEAKER DIARIZATION STRATEGIES, , , and , in: IEEE International Conference on Acoustics, Speech, and Signal Processing, 2007 |
Publications of type Phdthesis
2015
Enabling speech applications using Ad-Hoc Microphone Arrays, , École Polytechnique Fédérale de Lausanne, 2015 |
|
2013
Mining Conversational Social Video, , EPFL, 2013 |
|
Multilingual speech recognition: A posterior based approach, , École Polytechnique Fédérale de Lausanne (EPFL), 2013 |
|
2012
Alternative search techniques for face detection using location estimation and binary features, , ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE, 2012 |
|
2011
Boosting Localized Features for Speaker and Speech Recognition, , Ecole Polytechnique Federale de Lausanne (EPFL), 2011 |
|
2010
An Information Theoretic Approach to Speaker Diarization of Meeting Recordings, , Ecole polytechnique fédérale de Lausanne, 2010 |
|
2008
Methods for Asynchronous and Non-Invasive EEG-Based Brain-Computer Interfaces. Towards Intelligent Brain-Actuated Wheelchairs, , University of Barcelona, 2008 |
|