Keywords:
- 3D face model
- abnormality detection
- acoustic generators
- Acoustic signal processing
- activity
- appearance based methods
- Appearance based model
- appearance model
- Archaeology
- Artificial Neural Networks
- attention
- audio-visual speaker recognition
- Autism
- backchannels
- Bayesian modeling
- behavior analysis.
- bias correction
- blink
- bobbing estimation
- camera network
- Children
- clustering
- cognition
- computer vision
- Content-based multimedia indexing
- conversation
- convolutional network
- Convolutional neural network.
- Convolutional Neural Networks
- corpus
- covariance matrices
- Crowdsourcing
- Cultural heritage
- dataset
- deep learning
- Deep Metric Learning.
- deep neural networks
- Delays
- diarization
- Dimensionality reduction
- direction-of-arrival estimation
- DOA estimation
- domain adaptation
- embedding
- embedding learning
- Encoding
- entrainment to music
- Epigraphy
- Estimation
- eye movements
- eye tracking
- eye-gaze
- Face
- face clustering
- Face dirarization
- Face Recognition
- Face tracking
- Facial animations
- Feature extraction
- Feature-based tracking
- first impressions
- focus of attention
- gait
- Gaze
- Gaze Coding
- gaze detection
- Gaze estimation
- generative models
- geometric method
- grapevine pruning
- group dynamics
- HCI
- head nods
- Head pose
- Head pose tracking
- head-pose invariance
- HHI
- hieroglyph
- Histogram of orientation
- HOOSC
- HRI
- human activity recognition
- human behaviour analysis
- human detection
- Human pose estimation
- human-robot interaction
- image rectification
- image retrieval
- image segmentation
- indexing
- information fusion
- information visualization
- internet of things
- involvement
- keyframe extraction
- language
- learning
- likelihood-based encoding
- listener categories
- machine learning
- manipulation
- Maya civilization
- Maya culture
- Maya glyph
- maya glyphs
- Metric learning
- microphone arrays
- Microphones
- mixed activity
- Monte Carlo methods
- motif mining
- multi-camera
- multi-object tracking
- Multimodal
- multimodal identification
- Multimodal interaction
- Multimodal person diarization
- multiple face tracking
- multiple sound sources
- multiple speaker detection
- multivariate time series
- network output
- neural nets
- neural network-based sound source localization methods
- neural networks
- non parametric models
- non-verbal cues
- Nonverbal behavior
- OCR
- online calibration.
- particle filter
- person diarization
- person discovery
- person identification
- person invariance
- Person Tracking
- plant skeleton
- pLSA
- Position measurement
- precision viticulture
- real-time
- remote
- remote recording
- remote sensing
- remote sensor
- representation learning
- RGB-D
- RGB-D camera
- RGB-D cameras
- road vehicles
- Robots
- saccade
- Sampling
- scene analysis
- segmentation
- shape classification
- Shape descriptor
- shape recognition
- Shape retrieval
- shot boundary detection
- simultaneous detection
- single sound source
- sketch
- skin colour
- social computing
- sound mixtures
- sound source localization
- sparse autoencoder
- sparse coding
- spatial spectrum-based approaches
- speaker
- Speaker Diarization
- Speaker identification
- speaker recognition
- speaker verification
- spectral shot clustering
- Speech
- surveillance
- topic models
- tracking
- training
- transfer learning
- triplet loss
- unsupervised
- Unsupervised · Latent sequential patterns · Topic models · PLSA · Video surveillance · Activity analysis
- unsupervised activity analysis
- unsupervised calibration
- unsupervised learning
- usability
- user study
- variational inference
- ve- hicle detection.
- VFOA
- video
- video processing
- video structuring
- vineyard
- virtual agents
- visual focus of attention
- Visual similarity
- weakly-supervised learning.
Publications of Jean-Marc Odobez
2020
Unsupervised Representation Learning for Gaze Estimation, and , in: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020 |
|
WatchNet++: Efficient and accurate depth-based network for detecting people attacks and intrusion, , , and , in: Machine Vision and Applications, 2020 |
|
2019
A Deep Learning Approach for Robust Head Pose Independent Eye Movements Recognition from Videos, , and , in: 2019 ACM Symposium on Eye Tracking Research and Applications, pages 5, ACM, 2019 |
[DOI] |
Adaptation of Multiple Sound Source Localization Neural Networks with Weak Supervision and Domain-Adversarial Training, , and , Idiap-Com-01-2019 |
|
Adaptation of Multiple Sound Source Localization Neural Networks with Weak Supervision and Domain-Adversarial Training, , and , in: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Brighton, United Kingdom, pages 770-774, 2019 |
[DOI] |
Improving Few-Shot User-Specific Gaze Adaptation via Gaze Redirection Synthesis, , and , Idiap-RR-03-2019 |
|
Improving Few-Shot User-Specific Gaze Adaptation via Gaze Redirection Synthesis, , and , in: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019 |
Improving speech embedding using crossmodal transfer learning with audio-visual data, and , in: Multimedia Tools and Applications, 78(11):15681-15704, 2019 |
[DOI] |
2018
A Differential Approach for Gaze Estimation with Calibration, , , and , in: 29TH BRITISH MACHINE VISION CONFERENCE, 2018 |
|
Deep Multitask Gaze Estimation with a Constrained Landmark-Gaze Model, , and , in: European Conference on Computer Vision Workshop, 2018 |
|
Deep Neural Networks for Multiple Speaker Detection and Localization, , and , Idiap-RR-02-2018 |
|
Deep Neural Networks for Multiple Speaker Detection and Localization, , and , in: 2018 IEEE International Conference on Robotics and Automation (ICRA), Brisbane, AUSTRALIA, pages 74-79, 2018 |
[DOI] |
Facing Employers and Customers: What Do Gaze and Expressions Tell About Soft Skills?, , , and , in: Proceedings of the 17th International Conference on Mobile and Ubiquitous Multimedia, Cairo, Egypt, pages 121-126, ASSOC COMPUTING MACHINERY, 2018 |
[DOI] |
HeadFusion: 360 degree Head Pose Tracking Combining 3D Morphable Model and 3D Reconstruction, , and , in: IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 40(11), 2018 |
[DOI] |
How to Tell Ancient Signs Apart? Recognizing and Visualizing Maya Glyphs with CNNs, , and , in: ACM Journal on Computing and Cultural Heritage (JOCCH), 11(4):20, 2018 |
[DOI] |
Investigating Depth Domain Adaptation for Efficient Human Pose Estimation, , , and , in: European Conference on Computer Vision - Workshops, 2018 |
|
Joint Localization and Classification of Multiple Sound Sources Using a Multi-task Neural Network, , and , in: Proceedings of Interspeech, pages 312--316, 2018 |
[DOI] |
Joint Localization and Classification of Multiple Sound Sources Using a Multi-task Neural Network, , and , Idiap-RR-17-2018 |
|
Leveraging Convolutional Pose Machines for Fast and Accurate Head Pose Estimation, , and , in: 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, SPAIN, pages 1089-1094, IEEE, 2018 |
|
Maya Codical Glyph Segmentation: A Crowdsourcing Approach, , and , in: IEEE Transactions on Multimedia, 20(3):711-725, 2018 |
[DOI] [URL] |
Real-time Convolutional Networks for Depth-based Human Pose Estimation, , , and , in: IEEE/RSJ International Conference on Intelligent Robots and Systems, 2018 |
|
Robust and Discriminative Speaker Embedding via Intra-Class Distance Variance Regularization, and , in: Proceedings of Interspeech, Hyderabad, INDIA, pages 2257-2261, 2018 |
[DOI] |
UNICITY: A depth maps database for people detection in security airlocks, , , , , , , and , in: IEEE International Conference on Advanced Video and Signal-based Surveillance Workshop, 2018 |
|
WatchNet: Efficient and Depth-based Network for People Detection in Video Surveillance Systems, , , and , in: IEEE International Conference on Advanced Video and Signal-based Surveillance, Auckland, NEW ZEALAND, pages 109-114, 2018 |
|
2017
A Domain Adaptation Approach to Improve Speaker Turn Embedding Using Face Representation, and , in: ACM International Conference on Multimodal Interaction, Glasgow, Scotland, ACM, 2017 |
|
Active Online Anomaly Detection using Dirichlet Process Mixture Model and Gaussian Process Classification, , , , and , in: IEEE Winter Conference on Applications of Computer Vision (WACV), Washington, 2017 |
|
Analyzing and Visualizing Ancient Maya Hieroglyphics Using Shape: from Computer Vision to Digital Humanities, , , and , in: Digital Scholarship in the Humanities, 32:179-194, 2017 |
|
Extracting Maya Glyphs from Degraded Ancient Documents via Image Segmentation, , and , in: Journal on Computing and Cultural Heritage, 10, 2017 |
|
Improving speaker turn embedding by crossmodal transfer learning from face embedding, and , in: ICCV Workshop on Computer Vision for Audio-Visual Media, 2017 |
|
MAAYA: Multimedia Methods to Support Maya Epigraphic Analysis, , , , , , and , in: Arqueologia computacional: Nuevos enfoques para el analisis y la difusion del patrimonio cultural, INAH-RedTDPC, 2017 |
|
Maya Codical Glyph Segmentation: A Crowdsourcing Approach, , and , Idiap-RR-01-2017 |
|
Real-time Multiple Head Tracking Using Texture and Colour Cues, and , Idiap-RR-02-2017 |
|