Keywords:
- audio-visual speaker recognition
- deep learning
- Deep Metric Learning
- deep neural networks
- diarization
- domain adaptation
- embedding
- embedding learning
- Face
- face clustering
- Face diarization
- Face Recognition
- Metric learning
- multi-object tracking
- Multimodal
- multimodal identification
- Multimodal person diarization
- OCR
- person diarization
- person discovery
- person identification
- speaker
- Speaker Diarization
- Speaker identification
- speaker verification
- tracking
- transfer learning
- triplet loss
Publications of Nam Le sorted by journal and type
IEEE Transactions on Pattern Analysis and Machine Intelligence
Deep Dynamic Neural Networks for Multimodal Gesture Segmentation and Recognition, in: IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016
Multimedia Tools and Applications
Improving speech embedding using crossmodal transfer learning with audio-visual data, in: Multimedia Tools and Applications, 78(11):15681-15704, 2019
Proceedings of Interspeech (2018)
Robust and Discriminative Speaker Embedding via Intra-Class Distance Variance Regularization, in: Proceedings of Interspeech, Hyderabad, India, pages 2257-2261, 2018
ACM International Conference on Multimodal Interaction (2017)
A Domain Adaptation Approach to Improve Speaker Turn Embedding Using Face Representation, in: ACM International Conference on Multimodal Interaction, Glasgow, Scotland, ACM, 2017
ICCV Workshop on Computer Vision for Audio-Visual Media (2017)
Improving speaker turn embedding by crossmodal transfer learning from face embedding, in: ICCV Workshop on Computer Vision for Audio-Visual Media, 2017
15th International Workshop on Content-Based Multimedia Indexing (2017)
Towards large scale multimedia indexing: A case study on person discovery in broadcast news, in: 15th International Workshop on Content-Based Multimedia Indexing, 2017
MediaEval Benchmarking Initiative for Multimedia Evaluation (2016)
EUMSSI team at the MediaEval Person Discovery Challenge 2016, in: MediaEval Benchmarking Initiative for Multimedia Evaluation, Hilversum, Netherlands, 2016
ACM Multimedia (2016)
Learning Multimodal Temporal Representation for Dubbing Detection in Broadcast Media, in: ACM Multimedia, Amsterdam, ACM, 2016
2nd Workshop on Benchmarking Multi-target Tracking: MOTChallenge 2016 (2016)
Long-Term Time-Sensitive Costs for CRF-Based Tracking by Detection, in: 2nd Workshop on Benchmarking Multi-target Tracking: MOTChallenge 2016, Amsterdam, 2016
International Conference on Pattern Recognition (2016)
Temporally Subsampled Detection for Accurate and Efficient Face Tracking and Diarization, in: International Conference on Pattern Recognition, Cancun, Mexico, IEEE, 2016
Working Notes Proceedings of the MediaEval 2015 Workshop (2015)
EUMSSI team at the MediaEval Person Discovery Challenge, in: Working Notes Proceedings of the MediaEval 2015 Workshop, Wurzen, Germany, 2015
Publications of type PhD thesis
2019
Multimodal Person Recognition in Audio-Visual Streams, EPFL, 2019