Keywords:
- 3D face model
- abnormality detection
- acoustic generators
- Acoustic signal processing
- activity
- appearance based methods
- Appearance based model
- appearance model
- Archaeology
- Artificial Neural Networks
- attention
- audio-visual speaker recognition
- Autism
- backchannels
- Bayesian modeling
- behavior analysis
- bias correction
- blink
- bobbing estimation
- camera network
- Children
- clustering
- cognition
- computer vision
- Content-based multimedia indexing
- conversation
- convolutional network
- Convolutional neural network
- Convolutional Neural Networks
- corpus
- covariance matrices
- Crowdsourcing
- Cultural heritage
- dataset
- deep learning
- Deep Metric Learning
- deep neural networks
- Delays
- diarization
- Dimensionality reduction
- direction-of-arrival estimation
- DOA estimation
- domain adaptation
- embedding
- embedding learning
- Encoding
- entrainment to music
- Epigraphy
- Estimation
- eye movements
- eye tracking
- eye-gaze
- Face
- face clustering
- Face diarization
- Face Recognition
- Face tracking
- Facial animations
- Feature extraction
- Feature-based tracking
- first impressions
- focus of attention
- gait
- Gaze
- Gaze Coding
- gaze detection
- Gaze estimation
- generative models
- geometric method
- grapevine pruning
- group dynamics
- HCI
- head nods
- Head pose
- Head pose tracking
- head-pose invariance
- HHI
- hieroglyph
- Histogram of orientation
- HOOSC
- HRI
- human activity recognition
- human behaviour analysis
- human detection
- Human pose estimation
- human-robot interaction
- image rectification
- image retrieval
- image segmentation
- indexing
- information fusion
- information visualization
- internet of things
- involvement
- keyframe extraction
- language
- learning
- likelihood-based encoding
- listener categories
- machine learning
- manipulation
- Maya civilization
- Maya culture
- Maya glyph
- maya glyphs
- Metric learning
- microphone arrays
- Microphones
- mixed activity
- Monte Carlo methods
- motif mining
- multi-camera
- multi-object tracking
- Multimodal
- multimodal identification
- Multimodal interaction
- Multimodal person diarization
- multiple face tracking
- multiple sound sources
- multiple speaker detection
- multivariate time series
- network output
- neural nets
- neural network-based sound source localization methods
- neural networks
- non parametric models
- non-verbal cues
- Nonverbal behavior
- OCR
- online calibration
- particle filter
- person diarization
- person discovery
- person identification
- person invariance
- Person Tracking
- plant skeleton
- pLSA
- Position measurement
- precision viticulture
- real-time
- remote
- remote recording
- remote sensing
- remote sensor
- representation learning
- RGB-D
- RGB-D camera
- RGB-D cameras
- road vehicles
- Robots
- saccade
- Sampling
- scene analysis
- segmentation
- shape classification
- Shape descriptor
- shape recognition
- Shape retrieval
- shot boundary detection
- simultaneous detection
- single sound source
- sketch
- skin colour
- social computing
- sound mixtures
- sound source localization
- sparse autoencoder
- sparse coding
- spatial spectrum-based approaches
- speaker
- Speaker Diarization
- Speaker identification
- speaker recognition
- speaker verification
- spectral shot clustering
- Speech
- surveillance
- topic models
- tracking
- training
- transfer learning
- triplet loss
- unsupervised
- Unsupervised · Latent sequential patterns · Topic models · PLSA · Video surveillance · Activity analysis
- unsupervised activity analysis
- unsupervised calibration
- unsupervised learning
- usability
- user study
- variational inference
- vehicle detection
- VFOA
- video
- video processing
- video structuring
- vineyard
- virtual agents
- visual focus of attention
- Visual similarity
- weakly-supervised learning
Publications of Jean-Marc Odobez sorted by title
A
Adaptation of Multiple Sound Source Localization Neural Networks with Weak Supervision and Domain-Adversarial Training, , and , in: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Brighton, United Kingdom, pages 770-774, 2019
Algorithms for Video Structuring, , and , Idiap-Com-05-2002
An Attention Mechanism for Deep Q-Networks with Applications in Robotic Pushing, , and , Idiap-RR-03-2021
An Attention Mechanism for Deep Q-Networks with Applications in Robotic Pushing, , and , in: Proc. of Workshop on Emerging paradigms for robotic manipulation: from the lab to the productive world, ICRA, 2021
An Efficient Image-to-Image Translation HourGlass-based Architecture for Object Pushing Policy Learning, , and , in: IEEE/RSJ International Conference on Intelligent Robots and Systems, 2021
An Implicit Motion Likelihood for Tracking with Particle Filters, , and , in: British Machine Vision Conference (BMVC), Springer Verlag, 2003
An Implicit Motion Likelihood for Tracking with Particle Filters, , and , Idiap-RR-15-2003
Analyse non supervisée d'activités en vidéo surveillance pour l'analyse de scène et la détection d'événements anormaux, and , Idiap-RR-20-2013
Analyzing ancient Maya glyph collections with Contextual Shape Descriptors, , , and , in: International Journal of Computer Vision, 94(1):101-117, 2011
Analyzing and Visualizing Ancient Maya Hieroglyphics Using Shape: from Computer Vision to Digital Humanities, , , and , in: Digital Scholarship in the Humanities, 32:179-194, 2017
Ancient Maya Writings as High-Dimensional Data: a Visualization Approach, , , and , in: Digital Humanities (DH), Krakow, 2016
Application of Information Retrieval Technologies to Presentation Slides, and , in: IEEE Transactions on Multimedia, 8(5), 2006
Application of Information Retrieval Technologies to Presentation Slides, and , Idiap-RR-36-2005
Assessing a Shape Descriptor for Analysis of Mesoamerican Hieroglyphics: A View Towards Practice in Digital Humanities, , and , in: Digital Humanities Conference (DH), Krakow, 2016
Assessing Scene Structuring in Consumer Videos, , , , and , in: Int. Conf. on Image and Video Retrieval (CIVR), 2004
Assessing Scene Structuring in Consumer Videos, , , , and , Idiap-RR-11-2004
Assessing Sparse Coding Methods for Contextual Shape Indexing of Maya Hieroglyphs, , and , in: Journal of Multimedia, 7(2):179-192, 2012
Audio-Video Person Clustering in Video Databases, and , Idiap-RR-46-2003
Audio-visual probabilistic tracking of multiple speakers in meetings, , , and , in: IEEE Trans. on Audio, Speech, and Language Processing, accepted for publication, 2006
Audio-visual probabilistic tracking of multiple speakers in meetings, , , and , Idiap-RR-27-2005
Audio-Visual Speaker Tracking with Importance Particle Filters, , , , and , in: IEEE International Conference on Image Processing (ICIP), 2003
Audio-Visual Speaker Tracking with Importance Particle Filters, , , , and , Idiap-RR-37-2002
Automated Bobbing and Phase Analysis to Measure Walking Entrainment, , , , , , and , in: IEEE International Conference on Image Processing (ICIP), Paris, 2014
Automatic Maya Hieroglyph Retrieval Using Shape and Context Information, , , , and , in: ACM MM, pages 4, 2014
AV16.3: an Audio-Visual Corpus for Speaker Localization and Tracking, , and , in: Proceedings of the 2004 MLMI Workshop, S. Bengio and H. Bourlard Eds, Springer Verlag, 2005
AV16.3: an Audio-Visual Corpus for Speaker Localization and Tracking, , and , Idiap-RR-28-2004
B
Bridging the Past, Present and Future: Modeling Scene Activities From Event Relationships and Global Rules, , and , in: IEEE Conference on Computer Vision and Pattern Recognition, Providence, Rhode Island, USA, 2012
C
CCDb-HG: Novel Annotations and Gaze-Aware Representations for Head Gesture Recognition, , , and , in: 18th IEEE Int. Conference on Automatic Face and Gesture Recognition (FG), Istanbul, 2024
ChildPlay-Hand: A Dataset of Hand Manipulations in the Wild, , and , in: Proceedings of the European Conference on Computer Vision (ECCV) Workshops, 2024
ChildPlay: A New Benchmark for Understanding Children's Gaze Behaviour, , and , in: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023
Clustering flood events from water quality time-series using Latent Dirichlet Allocation model, , , , , , , , , and , in: Water Resources Research, 2013
Combined Estimation of Location and Body Pose in Surveillance Video, , and , in: AVSS, 2011
Combining dynamic head pose-gaze mapping with the robot conversational state for attention recognition in human-robot interactions, and , in: Pattern Recognition Letters, 66:81-90, 2015
Comparison of Support Vector Machine and Neural Network for Text Texture Verification, and , Idiap-RR-19-2002
Comparison of Two Methods for Unsupervised Person Identification in TV Shows, , , , and , in: 12th International Workshop on Content-Based Multimedia Indexing, 2014
Constructing visual models with a latent space approach, , , and , in: the Springer series of Lecture Notes in Computer Science, 2006
Constructing visual models with a latent space approach, , , and , Idiap-RR-14-2005
Context Aware Addressee Estimation for Human Robot Interaction, , , and , in: Proceedings of the 6th Workshop on Eye Gaze in Intelligent Human Machine Interaction: Gaze in Multimodal Interaction, 2013
Contextual classification of image patches with latent aspect models, , , and , in: EURASIP Journal on Image and Video Processing, Special Issue on Patches in Vision, 2009
CRF-Based Context Modeling for Person Identification in Broadcast Videos, , , and , in: Frontiers in ICT: Computer Image Analysis, 3, 2016