CONF Yao_ECCV-VS_2008/IDIAP Fast human detection from videos using covariance features Yao, Jian Odobez, Jean-Marc EXTERNAL https://publications.idiap.ch/attachments/papers/2008/Yao_ECCV-VS_2008.pdf PUBLIC https://publications.idiap.ch/index.php/publications/showcite/Yao_Idiap-RR-68-2007 Related documents European Conference on Computer Vision, workshop on Visual Surveillance (ECCV-VS) Marseille 2008 October 2008 In this paper, we present a fast method to detect humans from videos captured in surveillance applications. It is based on a cascade of LogitBoost classifiers relying on features mapped from the Riemannian manifold of region covariance matrices computed from input image features. The method was extended in several ways. First, as the mapping process is slow for high-dimensional feature spaces, we propose to select weak classifiers based on subsets of the complete image feature space. In addition, we propose to combine these sub-matrix covariance features with the means of the image features computed within the same subwindow, which are readily available from the covariance extraction process. Finally, in the context of video acquired with stationary cameras, we propose to fuse image features from the spatial and temporal domains in order to jointly learn the correlation between appearance and foreground information based on background subtraction. Our method was evaluated on a large set of videos coming from several databases (CAVIAR, PETS, ...) and can process from 5 to 20 frames/sec (for a 384x288 video) while achieving performance similar to or better than existing methods.
REPORT Yao_Idiap-RR-68-2007/IDIAP Fast Human Detection from Videos Using Covariance Features Yao, Jian Odobez, Jean-Marc EXTERNAL https://publications.idiap.ch/attachments/reports/2007/Yao_Idiap-RR-68-2007.pdf PUBLIC Idiap-RR-68-2007 2007 Idiap December 2007 In this paper, we present a fast method to detect humans from videos captured in surveillance applications.
It is based on a cascade of LogitBoost classifiers relying on features mapped from the Riemannian manifold of region covariance matrices computed from input image features. The method was extended in several ways. First, as the mapping process is slow for high-dimensional input image feature spaces, we propose to select weak classifiers based on subsets of the complete image feature space, corresponding to sub-matrices of the full covariance matrix. In addition, we propose to combine these sub-matrix covariance features with the means of the image features computed within the same subwindow, which are readily available from the fast covariance extraction process based on integral images. Finally, in the context of video acquired with stationary cameras, we propose to fuse image features from the spatial and temporal domains in order to take advantage of both appearance and foreground information based on background subtraction to detect humans. We evaluated our method on a large dataset of videos coming from several databases (CAVIAR, PETS, ...). The results show that our approach can process from 5 to 20 frames/second (for a 384x288 video) while achieving performance similar to that of existing methods.
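
The abstract notes that subwindow means come for free from the integral-image covariance extraction, and that weak classifiers can use sub-matrices of the full covariance matrix. The following is a minimal NumPy sketch of that idea, not the authors' implementation: the function names (`integral_tables`, `region_stats`), the array layout, and the unbiased normalisation are all assumptions made for illustration, and the mapping from the Riemannian manifold is not shown.

```python
import numpy as np

def integral_tables(features):
    """Build integral images for an (H, W, d) array of per-pixel features.

    Returns P with shape (H+1, W+1, d), holding sums of the features, and
    Q with shape (H+1, W+1, d, d), holding sums of their outer products.
    Both are zero-padded on the top and left (hypothetical layout).
    """
    h, w, d = features.shape
    f = features.astype(np.float64)
    P = np.zeros((h + 1, w + 1, d))
    Q = np.zeros((h + 1, w + 1, d, d))
    P[1:, 1:] = f.cumsum(axis=0).cumsum(axis=1)
    outer = f[:, :, :, None] * f[:, :, None, :]
    Q[1:, 1:] = outer.cumsum(axis=0).cumsum(axis=1)
    return P, Q

def region_stats(P, Q, top, left, bottom, right, subset=None):
    """Mean and covariance of the features over rows top..bottom-1 and
    columns left..right-1, using four table lookups per integral image.

    subset: optional feature indices; when given, only the matching
    entries of the mean and the matching covariance sub-matrix are returned.
    """
    n = (bottom - top) * (right - left)
    s = P[bottom, right] - P[top, right] - P[bottom, left] + P[top, left]
    ss = Q[bottom, right] - Q[top, right] - Q[bottom, left] + Q[top, left]
    mean = s / n                                # by-product of the same sums
    cov = (ss - np.outer(s, s) / n) / (n - 1)   # unbiased estimate (assumed)
    if subset is not None:
        idx = np.asarray(subset)
        mean = mean[idx]
        cov = cov[np.ix_(idx, idx)]
    return mean, cov

# Usage on random data standing in for real image features.
rng = np.random.default_rng(0)
feats = rng.normal(size=(12, 10, 4))
P, Q = integral_tables(feats)
mean, cov = region_stats(P, Q, 2, 3, 9, 8)
sub_mean, sub_cov = region_stats(P, Q, 2, 3, 9, 8, subset=[0, 2])
```

Once the two tables are built, the statistics of any rectangular subwindow cost a constant number of lookups regardless of window size, and restricting to a feature subset simply selects rows and columns of the full result.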