An Online Audio Indexing System - Idiap Publications

Type of publication:	Idiap-RR
Citation:	ajmera-rr-03-39
Number:	Idiap-RR-39-2003
Year:	2003
Institution:	IDIAP
Note:	Accepted for publication in ICSLP 2004
Abstract:	This paper presents an overview of an online audio indexing system that creates a searchable index of the speech content embedded in digitized audio files. The system is based on our recently proposed offline audio segmentation techniques. As the data arrives continuously, the system first finds the boundaries of acoustically homogeneous segments. Next, each segment is classified as speech, music, or mixture, where mixtures are defined as regions in which speech and other non-speech sounds are present simultaneously and noticeably. The speech segments are then clustered to provide consistent speaker labels. The speech and mixture segments are converted to text by an ASR system. The resulting words are time-stamped and stored together with other metadata (speaker identity, speech confidence score) in an XML file so that target segments can be rapidly identified and accessed. In this paper, we analyze the performance of each stage of this audio indexing system and compare it with that of the corresponding offline modules.
Userfields:	ipdmembership={speech},
Keywords:
Projects	Idiap
Authors	Ajmera, Jitendra; McCowan, Iain A.; Bourlard, Hervé
Crossref by	ajmera-rr-03-39b
Attachments
rr03-39.pdf rr03-39.ps.gz
Notes
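The abstract describes the final stage as writing time-stamped words and their metadata (speaker identity, confidence score) to an XML file for fast lookup. The record does not give the actual schema, so the following is only a minimal sketch of what such an index might look like. The element and attribute names (`index`, `segment`, `word`, `class`, `speaker`, `conf`) and the helper functions `build_index` and `find_word` are assumptions made for illustration, not the authors' format.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Word:
    text: str
    start: float       # seconds from the start of the audio
    end: float
    confidence: float  # hypothetical ASR confidence in [0, 1]


@dataclass
class Segment:
    start: float
    end: float
    label: str                      # "speech", "music" or "mixture"
    speaker: Optional[str] = None   # speaker cluster label, if any
    words: List[Word] = field(default_factory=list)


def build_index(segments: List[Segment]) -> ET.Element:
    """Serialize segments into a hypothetical <index> XML tree."""
    root = ET.Element("index")
    for seg in segments:
        attrs = {
            "start": f"{seg.start:.2f}",
            "end": f"{seg.end:.2f}",
            "class": seg.label,
        }
        if seg.speaker is not None:
            attrs["speaker"] = seg.speaker
        seg_el = ET.SubElement(root, "segment", attrs)
        # Only speech and mixture segments are passed to ASR in the abstract,
        # so only those carry <word> children here.
        if seg.label in ("speech", "mixture"):
            for w in seg.words:
                w_el = ET.SubElement(seg_el, "word", {
                    "start": f"{w.start:.2f}",
                    "end": f"{w.end:.2f}",
                    "conf": f"{w.confidence:.2f}",
                })
                w_el.text = w.text
    return root


def find_word(root: ET.Element, query: str) -> List[Tuple[float, Optional[str]]]:
    """Return (start time, speaker) for each case-insensitive match of query."""
    hits = []
    for seg_el in root.iter("segment"):
        for w_el in seg_el.iter("word"):
            if (w_el.text or "").lower() == query.lower():
                hits.append((float(w_el.get("start")), seg_el.get("speaker")))
    return hits
```

Calling `ET.tostring(build_index(segments), encoding="unicode")` on a list of `Segment` objects yields the XML text, and `find_word` shows how a query term could be mapped back to a playback position and speaker label. A real system would likely use an inverted index rather than a linear scan; the scan is kept here only for brevity.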
