All publications sorted by recency
Robust Microphone Placement for Source Localization from Noisy Distance Measurements, , , , and , in: IEEE 40th International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 2579-2583, IEEE, 2015 |
[DOI] |
Building context-dependent DNN acoustic models using Kullback-Leibler divergence-based state tying, , , and , in: Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing, 2015 |
|
Discourse-level Features for Statistical Machine Translation, , École Polytechnique Fédérale de Lausanne (EPFL), 2014 |
|
What Your Face Vlogs About: Expressions of Emotion and Big-Five Traits Impressions in YouTube, , , and , in: IEEE Transactions on Affective Computing, 2014 |
|
The Workshop on Computational Personality Recognition 2014, , , , , and , in: Proceedings of the ACM International Conference on Multimedia, 2014 |
|
Mining Crowdsourced First Impressions in Online Social Video, and , in: IEEE Transactions on Multimedia, 16(7), 2014 |
|
Preliminary Work on Speaker Adaptation for DNN-Based Speech Synthesis, , and , Idiap-RR-02-2015 |
|
Automatic Blinking Detection towards Stress Discovery, , , and , in: Proc. ACM Int. Conf. on Multimodal Interaction, Istanbul, pages 307-310, ACM New York, 2014 |
[DOI] |
Capturing Upper Body Motion in Conversation: an Appearance Quasi-Invariant Approach, , , and , in: Proc. ACM Int. Conf. on Multimodal Interaction, Istanbul, pages 327-334, ACM New York, 2014 |
[DOI] |
Signal Processing in the Workplace, , in: IEEE Signal Processing Magazine, 32(1):121-125, 2015 |
|
Leveraging Colour Segmentation for Upper-Body Detection, and , in: Pattern Recognition, 47(6):2222-2230, 2014 |
|
Detecting and Labeling Speakers on Overlapping Speech using Vector Taylor Series, , and , in: INTERSPEECH, 2014 |
|
Multi-source Posteriors for Speech Activity Detection on Public Talks, and , in: INTERSPEECH, 2014 |
|
Diarizing Large Corpora using Multi-modal Speaker Linking, , , and , in: INTERSPEECH 2014, 2014 |
|
LETHA: Learning from High Quality Inputs for 3D Pose Estimation in Low Quality Images, , , and , in: Proceedings of the International Conference on 3D Vision, pages 517–524, 2014 |
Tracking Interacting Objects Optimally Using Integer Programming, , , and , in: Proceedings of the European Conference on Computer Vision, pages 17-32, 2014 |
|
Phoneme Background Model for Information Bottleneck based Speaker Diarization, , and , in: Interspeech, Singapore, 2014 |
|
Artificial neural network features for speaker diarization, , and , in: IEEE Spoken Language Technology workshop, South Lake Tahoe, USA, 2014 |
|
Overlapping speech detection using long-term conversational features for speaker diarization in meeting room conversations, and , in: IEEE/ACM Transactions on Audio, Speech, and Language Processing, 22(12):1688-1700, 2014 |
|
Anti-Spoofing: Face Databases, , and , in: Encyclopedia of Biometrics, Springer US, 2014 |
[DOI] [URL] |
Anti-spoofing: Evaluation Methodologies, , and , in: Encyclopedia of Biometrics, Springer US, 2014 |
[DOI] |
Evaluation Databases, , , and , in: Handbook of Biometric Anti-Spoofing, pages 247-278, Springer-Verlag, 2014 |
[DOI] |
Face Anti-spoofing: Visual Approach, , , , and , in: Handbook of Biometric Anti-Spoofing, pages 65-82, Springer-Verlag, 2014 |
[DOI] |
LETHA: Learning from High Quality Inputs for 3D Pose Estimation in Low Quality Images, , , and , Idiap-RR-22-2014 |
|
Employment of Subspace Gaussian Mixture Models in Speaker Recognition, , , and , Idiap-RR-16-2015 |
|
Development of Bilingual ASR System for MediaParl Corpus, , , and , in: Proceedings of the 15th Annual Conference of the International Speech Communication Association (Interspeech 2014), Singapore, ISCA, 2014 |
|
Development of Bilingual ASR System for MediaParl Corpus, , , and , Idiap-RR-21-2014 |
|
Sample Distillation for Object Detection and Image Classification, , and , in: Proceedings of the 6th Asian Conference on Machine Learning (ACML), Nha Trang, Vietnam, 2014 |
|
Efficient Sample Mining for Object Detection, and , in: Proceedings of the 6th Asian Conference on Machine Learning (ACML), Nha Trang, Vietnam, 2014 |
|
Keyword Extraction and Clustering for Document Recommendation in Conversations, and , in: IEEE/ACM Transactions on Audio Speech and Language Processing, 23(4):746-759, 2015 |
[DOI] |
Otomatik İşaret Dili Tanıma ve Türk İşaret Dili için Bilgisayar Uygulamaları [Automatic Sign Language Recognition and Computer Applications for Turkish Sign Language], , , , and , in: Ellerle Konusmak: Turk Isaret Dili Arastirmalari / Research on Turkish Sign Language, pages 471-498, Koc University Press, 2016 |
Modeling Annotator Behaviors for Crowd Labeling, , , and , in: Neurocomputing, 160:141–156, 2015 |
[DOI] |
Discourse connectives: theoretical models and empirical validations in humans and computers, and , in: Papers dedicated to Jacques Moeschler, University of Geneva, 2014 |
[URL] |
ROCKIT: Roadmap for Conversational Interaction Technologies, , , , , , , , , , , , , , and , in: Proceedings of the 2014 Workshop on Roadmapping the Future of Multimodal Interaction Research including Business Opportunities and Challenges (RFMIR '14), pages 39-42, ACM, 2014 |
[DOI] |
Syllabic Pitch Tuning for Neutral-to-Emotional Voice Conversion, , and , Idiap-RR-31-2015 |
|
Transfer Learning through Greedy Subset Selection, , and , Idiap-RR-26-2015 |
|
Incremental Syllable-Context Phonetic Vocoding, , , , and , Idiap-RR-05-2015 |
|
Phonological vocoding using artificial neural networks, , and , Idiap-RR-04-2015 |
|
Acoustic and Lexical Resource Constrained ASR using Language-Independent Acoustic Model and Language-Dependent Probabilistic Lexical Model, and , in: Speech Communication, 68:23–40, 2015 |
[DOI] [URL] |
Impact of Eye Detection Error on Face Recognition Performance, , , , , and , in: IET Biometrics, 2015 |
[URL] |
A Skill Transfer Approach for Continuum Robots - Imitation of Octopus Reaching Motion with the STIFF-FLOP Robot, , , and , in: Proc. of the AAAI Symp. on Knowledge, Skill, and Behavior Transfer in Autonomous Robots, Arlington, VA, USA, pages 49-52, 2014 |
[URL] |
Skills Learning in Robots by Interaction with Users and Environment, , in: Proc. of the Intl Conf. on Ubiquitous Robots and Ambient Intelligence (URAI), Kuala Lumpur, Malaysia, pages 161-162, 2014 |
[URL] |
Combining SGMM Speaker Vectors and KL-HMM Approach for Speaker Diarization, , and , Idiap-RR-17-2015 |
|
KL-HMM Based Speaker Diarization System for Meetings, and , Idiap-RR-19-2015 |
|
Articulatory Feature based Continuous Speech Recognition using Probabilistic Lexical Modeling, and , Idiap-RR-19-2014 |
|
Who Will Get the Grant? A Multimodal Corpus for the Analysis of Conversational Behaviours in Group Interviews, , , , and , in: International Conference on Multimodal Interaction, Understanding and Modeling Multiparty, Multimodal Interactions Workshop, Istanbul, Turkey, ACM, 2014 |
[DOI] |
Joint Phoneme Segmentation Inference and Classification using CRFs, , and , in: Global Conference on Signal and Information Processing, Atlanta, GA, pages 587-591, IEEE, 2014 |
[DOI] |
Convolutional Neural Networks-based Continuous Speech Recognition using Raw Speech Signal, , and , Idiap-RR-18-2014 |
|