Keywords:
- 3D face model
- abnormality detection
- acoustic generators
- Acoustic signal processing
- activity
- appearance based methods
- Appearance based model
- appearance model
- Archaeology
- Artificial Neural Networks
- attention
- audio-visual speaker recognition
- Autism
- backchannels
- Bayesian modeling
- behavior analysis.
- bias correction
- blink
- bobbing estimation
- camera network
- Children
- clustering
- cognition
- computer vision
- Content-based multimedia indexing
- conversation
- convolutional network
- Convolutional neural network.
- Convolutional Neural Networks
- corpus
- covariance matrices
- Crowdsourcing
- Cultural heritage
- dataset
- deep learning
- Deep Metric Learning.
- deep neural networks
- Delays
- diarization
- Dimensionality reduction
- direction-of-arrival estimation
- DOA estimation
- domain adaptation
- embedding
- embedding learning
- Encoding
- entrainment to music
- Epigraphy
- Estimation
- eye movements
- eye tracking
- eye-gaze
- Face
- face clustering
- Face dirarization
- Face Recognition
- Face tracking
- Facial animations
- Feature extraction
- Feature-based tracking
- first impressions
- focus of attention
- gait
- Gaze
- Gaze Coding
- gaze detection
- Gaze estimation
- generative models
- geometric method
- grapevine pruning
- group dynamics
- HCI
- head nods
- Head pose
- Head pose tracking
- head-pose invariance
- HHI
- hieroglyph
- Histogram of orientation
- HOOSC
- HRI
- human activity recognition
- human behaviour analysis
- human detection
- Human pose estimation
- human-robot interaction
- image rectification
- image retrieval
- image segmentation
- indexing
- information fusion
- information visualization
- internet of things
- involvement
- keyframe extraction
- language
- learning
- likelihood-based encoding
- listener categories
- machine learning
- manipulation
- Maya civilization
- Maya culture
- Maya glyph
- maya glyphs
- Metric learning
- microphone arrays
- Microphones
- mixed activity
- Monte Carlo methods
- motif mining
- multi-camera
- multi-object tracking
- Multimodal
- multimodal identification
- Multimodal interaction
- Multimodal person diarization
- multiple face tracking
- multiple sound sources
- multiple speaker detection
- multivariate time series
- network output
- neural nets
- neural network-based sound source localization methods
- neural networks
- non parametric models
- non-verbal cues
- Nonverbal behavior
- OCR
- online calibration.
- particle filter
- person diarization
- person discovery
- person identification
- person invariance
- Person Tracking
- plant skeleton
- pLSA
- Position measurement
- precision viticulture
- real-time
- remote
- remote recording
- remote sensing
- remote sensor
- representation learning
- RGB-D
- RGB-D camera
- RGB-D cameras
- road vehicles
- Robots
- saccade
- Sampling
- scene analysis
- segmentation
- shape classification
- Shape descriptor
- shape recognition
- Shape retrieval
- shot boundary detection
- simultaneous detection
- single sound source
- sketch
- skin colour
- social computing
- sound mixtures
- sound source localization
- sparse autoencoder
- sparse coding
- spatial spectrum-based approaches
- speaker
- Speaker Diarization
- Speaker identification
- speaker recognition
- speaker verification
- spectral shot clustering
- Speech
- surveillance
- topic models
- tracking
- training
- transfer learning
- triplet loss
- unsupervised
- Unsupervised · Latent sequential patterns · Topic models · PLSA · Video surveillance · Activity analysis
- unsupervised activity analysis
- unsupervised calibration
- unsupervised learning
- usability
- user study
- variational inference
- ve- hicle detection.
- VFOA
- video
- video processing
- video structuring
- vineyard
- virtual agents
- visual focus of attention
- Visual similarity
- weakly-supervised learning.
Publications of Jean-Marc Odobez
2007
Short-Term Spatio-Temporal Clustering Applied to Multiple Moving Speakers, and , in: IEEE Transactions on Audio, Speech and Language Processing, 15(5):15, 2007 |
|
Using Audio and Video Features to Classify the Most Dominant Person in a Group Meeting, , , , , , , , and , in: "", 2007 |
|
Using Audio and Video Features to Classify the Most Dominant Person in a Group Meeting, , , , , , , , and , Idiap-RR-29-2007 |
|
Visual Focus of Attention Estimation from Head Pose Posterior Probability Distributions, and , Idiap-RR-75-2007 |
|
2006
A Study on Visual Focus of Attention Recognition from Head Pose in a Meeting Room, and , in: 3rd Joint Workshop on Multimodal Interaction and Related Machine Learning Algorithms (MLMI06), 2006 |
|
A Study on Visual Focus of Attention Recognition from Head Pose in a Meeting Room, and , Idiap-RR-10-2006 |
|
Application of Information Retrieval Technologies to Presentation Slides, and , in: IEEE Transactions on Multimedia, 8(5), 2006 |
|
Audio-visual probabilistic tracking of multiple speakers in meetings, , , and , in: IEEE Trans. on Audio, Speech, and Language Processing, accepted for publication., 2006 |
|
Constructing visual models with a latent space approach, , , and , in: the Springer series of Lecture Notes in Computer Science, 2006 |
|
Embedding Motion in Model-Based Stochastic Tracking, , and , in: IEEE Transaction on Image Processing, 15(11), 2006 |
Integrating co-occurrence and spatial contexts on patch-based scene segmentation, , , and , in: Beyond Patches Workshop, in conjunction with CVPR, 2006 |
|
Natural Scene Image Modeling using Color and Texture Visterms., and , in: Conference on Image and Video Retrieval CIVR, 2006 |
|
Natural Scene Image Modeling using Color and Texture Visterms., and , Idiap-RR-17-2006 |
|
Recognizing People's Focus of Attention from Head Poses: a Study, and , Idiap-RR-42-2006 |
|
Tracking Attention for Multiple People: Wandering Visual Focus of Attention Estimation, , , and , Idiap-RR-40-2006 |
|
Tracking the Multi Person Wandering Visual Focus of Attention, , , and , in: International Conference on Multimodal Interfaces (ICMI06), 2006 |
|
2005
A Rao-Blackwellized Mixed State Particle Filter for Head Pose Tracking, and , in: ACM ICMI Workshop on Multimodal Multiparty Meeting Processing (MMMP), 2005 |
|
A Rao-Blackwellized Mixed State Particle Filter for Head Pose Tracking, and , Idiap-RR-35-2005 |
|
A Thousand Words in a Scene, , , and , Idiap-RR-40-2005 |
|
A Video Database for Head Pose Tracking Evaluation, and , Idiap-Com-04-2005 |
|
Application of Information Retrieval Technologies to Presentation Slides, and , Idiap-RR-36-2005 |
|
Audio-visual probabilistic tracking of multiple speakers in meetings, , , and , Idiap-RR-27-2005 |
|
AV16.3: an Audio-Visual Corpus for Speaker Localization and Tracking, , and , in: Proceedings of the 2004 MLMI Workshop, S. Bengio and H. Bourlard Eds, Springer Verlag, 2005 |
|
Constructing visual models with a latent space approach, , , and , Idiap-RR-14-2005 |
|
Evaluation of Multiple Cues Head Pose Tracking Algorithm in Indoor Environments, and , Idiap-RR-05-2005 |
|
Evaluation of Multiple Cues Head Pose Tracking Algorithm in Natural Environments, and , in: International Conference on Multimedia & Expo ICME 2005, 2005 |
|
Integrating co-occurrence and spatial contexts on patch-based scene segmentation, , , and , Idiap-RR-30-2005 |
|
Modeling Scenes with Local Descriptors and Latent Aspects, , , , , and , in: IEEE Int. Conf. on Computer Vision, 2005 |
|
Monte Carlo Video Text Segmentation, , and , in: International Journal of Pattern Recognition and Artificial Intelligence (IJPRAI), 19(5), 2005 |
|
Multimodal Multispeaker Probabilistic Tracking in Meetings, , , and , in: Proc. Int. Conf. on Multimodal Interfaces (ICMI), 2005 |
|
OCR Based Slide Retrieval, , and , Idiap-RR-11-2005 |
|
Sports Event Recognition using Layered HMMs, and , Idiap-RR-07-2005 |
|
Threshold Selection for Unsupervised Detection, with an Application to Microphone Arrays, , , and , Idiap-RR-52-2005 |
|
Tracking People in Meetings with Particles, , , , and , in: Proc. Int. Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS,',','), invited paper, 2005 |
|
Tracking the Multi Person Wandering Visual Focus of Attention, , , and , Idiap-RR-80-2005 |
|
Video Text Recognition using Sequential Monte Carlo and Error Voting Methods, and , in: Pattern Recognition Letters, 26(9), 2005 |
|
2004
A Localization/Verification Scheme for Finding Text in Images and Video Frames Based on Contrast Independent Features and Machine Learning Methods, , and , in: Signal Processing: Image Communication, 19(3), 2004 |
|
A probabilistic framework for joint head tracking and pose estimation, and , in: 17th Int. Conf. Pattern Recognition (ICPR), 2004 |
|
Assessing Scene Structuring in Consumer Videos, , , , and , in: Int. Conf. on Image and Video Retrieval (CIVR), 2004 |
|
Assessing Scene Structuring in Consumer Videos, , , , and , Idiap-RR-11-2004 |
|
AV16.3: an Audio-Visual Corpus for Speaker Localization and Tracking, , and , Idiap-RR-28-2004 |
|
Embedding motion in model-based stochastic tracking, and , in: 17th Int. Conf. Pattern Recognition (ICPR), 2004 |
|
Fusion of Structural and Color Local Descriptors for Enhanced Object Recognition, and , in: Proceedings IEEE WIAMIS 2004(5th International Workshop on Image Analysis for Multimedia Interactive Services,',','), 21-23 April, 2004, Lisboa, Portugal, 2004 |
|
Modeling Scenes with Local Descriptors and Latent Aspects, , , , , and , Idiap-RR-79-2004 |
|
Motion likelihood and proposal modeling in Model-Based Stochastic Tracking, and , Idiap-RR-61-2004 |
|
Multimodal Multispeaker Probabilistic Tracking in Meetings, , , and , Idiap-RR-66-2004 |
|
Robust Playfield Segmentation using MAP Adaptation, and , in: Proc. 17th International Conference on Pattern Recognition (ICPR 2004), 2004 |
|
Short-Term Spatio-Temporal Clustering of Sporadic and Concurrent Events, , and , Idiap-RR-14-2004 |
|
Text Detection and Recognition in Images and Videos, , and , in: Pattern Recognition, 37(3), 2004 |
Tracking People in Meetings with Particles, , , , and , Idiap-RR-71-2004 |
|