Keywords:
- 3D face model
- abnormality detection
- acoustic generators
- Acoustic signal processing
- activity
- appearance based methods
- Appearance based model
- appearance model
- Archaeology
- Artificial Neural Networks
- attention
- audio-visual speaker recognition
- Autism
- backchannels
- Bayesian modeling
- behavior analysis
- bias correction
- blink
- bobbing estimation
- camera network
- Children
- clustering
- cognition
- computer vision
- Content-based multimedia indexing
- conversation
- convolutional network
- Convolutional neural network
- Convolutional Neural Networks
- corpus
- covariance matrices
- Crowdsourcing
- Cultural heritage
- dataset
- deep learning
- Deep Metric Learning
- deep neural networks
- Delays
- diarization
- Dimensionality reduction
- direction-of-arrival estimation
- DOA estimation
- domain adaptation
- embedding
- embedding learning
- Encoding
- entrainment to music
- Epigraphy
- Estimation
- eye movements
- eye tracking
- eye-gaze
- Face
- face clustering
- Face diarization
- Face Recognition
- Face tracking
- Facial animations
- Feature extraction
- Feature-based tracking
- first impressions
- focus of attention
- gait
- Gaze
- Gaze Coding
- gaze detection
- Gaze estimation
- generative models
- geometric method
- grapevine pruning
- group dynamics
- HCI
- head nods
- Head pose
- Head pose tracking
- head-pose invariance
- HHI
- hieroglyph
- Histogram of orientation
- HOOSC
- HRI
- human activity recognition
- human behaviour analysis
- human detection
- Human pose estimation
- human-robot interaction
- image rectification
- image retrieval
- image segmentation
- indexing
- information fusion
- information visualization
- internet of things
- involvement
- keyframe extraction
- language
- learning
- likelihood-based encoding
- listener categories
- machine learning
- manipulation
- Maya civilization
- Maya culture
- Maya glyph
- maya glyphs
- Metric learning
- microphone arrays
- Microphones
- mixed activity
- Monte Carlo methods
- motif mining
- multi-camera
- multi-object tracking
- Multimodal
- multimodal identification
- Multimodal interaction
- Multimodal person diarization
- multiple face tracking
- multiple sound sources
- multiple speaker detection
- multivariate time series
- network output
- neural nets
- neural network-based sound source localization methods
- neural networks
- non parametric models
- non-verbal cues
- Nonverbal behavior
- OCR
- online calibration
- particle filter
- person diarization
- person discovery
- person identification
- person invariance
- Person Tracking
- plant skeleton
- pLSA
- Position measurement
- precision viticulture
- real-time
- remote
- remote recording
- remote sensing
- remote sensor
- representation learning
- RGB-D
- RGB-D camera
- RGB-D cameras
- road vehicles
- Robots
- saccade
- Sampling
- scene analysis
- segmentation
- shape classification
- Shape descriptor
- shape recognition
- Shape retrieval
- shot boundary detection
- simultaneous detection
- single sound source
- sketch
- skin colour
- social computing
- sound mixtures
- sound source localization
- sparse autoencoder
- sparse coding
- spatial spectrum-based approaches
- speaker
- Speaker Diarization
- Speaker identification
- speaker recognition
- speaker verification
- spectral shot clustering
- Speech
- surveillance
- topic models
- tracking
- training
- transfer learning
- triplet loss
- unsupervised
- Activity analysis
- Latent sequential patterns
- Video surveillance
- unsupervised activity analysis
- unsupervised calibration
- unsupervised learning
- usability
- user study
- variational inference
- vehicle detection
- VFOA
- video
- video processing
- video structuring
- vineyard
- virtual agents
- visual focus of attention
- Visual similarity
- weakly-supervised learning
Publications of Jean-Marc Odobez
2013
Person Independent 3D Gaze Estimation From Remote RGB-D Cameras, in: International Conference on Image Processing, Melbourne, Australia, IEEE, 2013
Real-Time Audio-Visual Analysis for Multiperson Videoconferencing, in: Advances in Multimedia, 2013:21, 2013
The vernissage corpus: a conversational human-robot-interaction dataset, in: Proceedings of the 8th ACM/IEEE international conference on Human-robot interaction, 2013
Time-Sensitive Topic Models for Action Recognition in Videos, in: IEEE International Conference on Image Processing, 2013
Unsupervised Methods for Activity Analysis and Detection of Abnormal Events, Idiap-RR-21-2013
Unsupervised methods for activity analysis and detection of abnormal events, in: Intelligent Video Surveillance Systems (ISTE), Wiley, 2013
2012
Assessing Sparse Coding Methods for Contextual Shape Indexing of Maya Hieroglyphs, in: Journal of Multimedia, 7(2):179-192, 2012
Bridging the Past, Present and Future: Modeling Scene Activities From Event Relationships and Global Rules, in: IEEE Conference on Computer Vision and Pattern Recognition, Providence, Rhode Island, USA, 2012
Gaze Estimation From Multimodal Kinect Data, in: IEEE Conference on Computer Vision and Pattern Recognition, Workshop on Gesture Recognition, Providence, RI, USA, 2012
Investigating the Midline Effect for Visual Focus of Attention Recognition, in: Int. Conf. on Multimodal Interaction (ICMI), Santa Monica, 2012
Recognizing the Visual Focus of Attention for Human Robot Interaction, in: IEEE International Conference on Intelligent Robots and Systems (IROS) - Human Behavior Understanding Workshop (IROS-HBU), 2012
Robot-to-group Interaction in a Vernissage: Architecture & Dataset for Multi-party Dialog, in: Proceedings of 5th International Conference on Cognitive Systems, 2012
Sampling techniques for audio-visual tracking and head pose estimation, in: Multimodal Signal Processing: Human Interactions in Meetings, pages 84-102, Cambridge University Press, 2012
Sparsity in Topic Models, in: Practical Applications of Sparse Modeling: Biology, Signal Processing and Beyond, MIT Press, 2012
Statistical Shape Descriptors for Ancient Maya Hieroglyphs Analysis, École Polytechnique Fédérale de Lausanne, 2012
The Vernissage Corpus: A Multimodal Human-Robot-Interaction Dataset, Idiap-RR-33-2012
Unsupervised Activity Analysis and Monitoring algorithms for Effective Surveillance Systems, in: European Conference on Computer Vision, 2012
Using self-context for multimodal detection of head nods in face-to-face interactions, Idiap-RR-27-2012
Using Self-Context for Multimodal Detection of Head Nods in Face-to-Face Interactions, in: Proceedings of the 14th ACM International Conference on Multimodal Interaction, 2012
We are not Contortionists: Coupled Adaptive Learning for Head and Body Orientation Estimation in Surveillance Video, in: IEEE International Conference on Computer Vision and Pattern Recognition, 2012
2011
3D human pose recovery from image by efficient visual feature selection, in: Computer Vision and Image Understanding, 115(3), 2011
A Bimodal Sound Source Model for Vehicle Tracking in Traffic Monitoring, in: European Signal Processing Conference, 2011
A Joint Estimation of Head and Body Orientation Cues in Surveillance Video, in: IEEE International Workshop on Socially Intelligent Surveillance and Monitoring, 2011
Analyzing ancient Maya glyph collections with Contextual Shape Descriptors, in: International Journal of Computer Vision, 94(1):101-117, 2011
Combined Estimation of Location and Body Pose in Surveillance Video, in: AVSS, 2011
Detection-Based Multi-Human Tracking Using a CRF Model, in: The Eleventh IEEE International Workshop on Visual Surveillance, 2011
Engagement-based Multi-party Dialog with a Humanoid Robot, in: Proceedings of the SIGDIAL 2011: the 12th Annual Meeting of the Special Interest Group on Discourse and Dialogue, pages 341-343, 2011
Exploiting Long-Term Observations for Track Creation and Deletion in Online Multi-Face Tracking, Idiap-RR-01-2011
Exploiting Long-Term Observations for Track Creation and Deletion in Online Multi-Face Tracking, in: IEEE Conference on Automatic Face and Gesture Recognition, pages 525-530, IEEE, 2011
Extracting and Locating Temporal Motifs in Video Scenes Using a Hierarchical Non Parametric Bayesian Model, in: IEEE Conference on Computer Vision and Pattern Recognition, 2011
Fast Human Detection from Joint Appearance and Foreground Feature Subset Covariances, in: Computer Vision and Image Understanding, 115(10):1414-1426, 2011
Joint Adaptive Colour Modelling and Skin, Hair and Clothing Segmentation Using Coherent Probabilistic Index Maps, in: British Machine Vision Conference, British Machine Vision Association, Dundee, UK, 2011
Multi-camera Open Space Human Activity Discovery for Anomaly Detection, in: 8th IEEE International Conference on Advanced Video and Signal-Based Surveillance, 2011
Multi-Person Visual Focus of Attention from Head Pose and Meeting Contextual Cues, in: IEEE Trans. on Pattern Analysis and Machine Intelligence, 33(1):101-116, 2011
New world, New Worlds: Visual Analysis of Pre-Columbian Pictorial Collections, in: Proceedings of the International Workshop on Multimedia for Cultural Heritage, Modena, Italy, Springer CCIS series book, 2011
Searching the Past: An Improved Shape Descriptor to Retrieve Maya Hieroglyphs, in: Proceedings of the ACM International Conference in Multimedia, Scottsdale, USA, ACM, 2011