<?xml version="1.0" encoding="UTF-8"?>
<collection xmlns="http://www.loc.gov/MARC21/slim">
	<record>
		<datafield tag="980" ind1=" " ind2=" ">
			<subfield code="a">CONF</subfield>
		</datafield>
		<datafield tag="970" ind1=" " ind2=" ">
			<subfield code="a">misr04/IDIAP</subfield>
		</datafield>
		<datafield tag="245" ind1=" " ind2=" ">
			<subfield code="a">Spectral Entropy Based Feature for Robust ASR</subfield>
		</datafield>
		<datafield tag="700" ind1=" " ind2=" ">
			<subfield code="a">Misra, Hemant</subfield>
		</datafield>
		<datafield tag="700" ind1=" " ind2=" ">
			<subfield code="a">Ikbal, Shajith</subfield>
		</datafield>
		<datafield tag="700" ind1=" " ind2=" ">
			<subfield code="a">Bourlard, Hervé</subfield>
		</datafield>
		<datafield tag="700" ind1=" " ind2=" ">
			<subfield code="a">Hermansky, Hynek</subfield>
		</datafield>
		<datafield tag="856" ind1="4" ind2="0">
			<subfield code="i">EXTERNAL</subfield>
			<subfield code="u">http://publications.idiap.ch/attachments/reports/2003/rr03-56.pdf</subfield>
			<subfield code="x">PUBLIC</subfield>
		</datafield>
		<datafield tag="856" ind1="4" ind2=" ">
			<subfield code="u">http://publications.idiap.ch/index.php/publications/showcite/misra-rr-03-56</subfield>
			<subfield code="z">Related documents</subfield>
		</datafield>
		<datafield tag="711" ind1="2" ind2=" ">
			<subfield code="a">Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)</subfield>
		</datafield>
		<datafield tag="260" ind1=" " ind2=" ">
			<subfield code="c">2004</subfield>
			<subfield code="a">Montreal, Canada</subfield>
		</datafield>
		<datafield tag="771" ind1="2" ind2=" ">
			<subfield code="d">May 2004</subfield>
		</datafield>
		<datafield tag="500" ind1=" " ind2=" ">
			<subfield code="a">IDIAP-RR 2003 56</subfield>
		</datafield>
		<datafield tag="520" ind1=" " ind2=" ">
			<subfield code="a">In general, entropy gives us a measure of the number of bits required to represent some information. When applied to a probability mass function (PMF), entropy can also be used to measure the "peakiness" of a distribution. In this paper, we propose using the entropy of the short-time Fourier transform spectrum, normalised as a PMF, as an additional feature for automatic speech recognition (ASR). It is indeed expected that a peaky spectrum, representing clear formant structure in the case of voiced sounds, will have low entropy, while a flatter spectrum corresponding to non-speech or noisy regions will have higher entropy. Extending this reasoning further, we introduce the idea of a multi-band/multi-resolution entropy feature, where we divide the spectrum into equal-size sub-bands and compute the entropy in each sub-band. The results presented in this paper show that multi-band entropy features used in conjunction with normal cepstral features improve the performance of an ASR system.</subfield>
		</datafield>
	</record>
	<record>
		<datafield tag="980" ind1=" " ind2=" ">
			<subfield code="a">REPORT</subfield>
		</datafield>
		<datafield tag="970" ind1=" " ind2=" ">
			<subfield code="a">misra-rr-03-56/IDIAP</subfield>
		</datafield>
		<datafield tag="245" ind1=" " ind2=" ">
			<subfield code="a">Spectral Entropy Based Feature for Robust ASR</subfield>
		</datafield>
		<datafield tag="700" ind1=" " ind2=" ">
			<subfield code="a">Misra, Hemant</subfield>
		</datafield>
		<datafield tag="700" ind1=" " ind2=" ">
			<subfield code="a">Ikbal, Shajith</subfield>
		</datafield>
		<datafield tag="700" ind1=" " ind2=" ">
			<subfield code="a">Bourlard, Hervé</subfield>
		</datafield>
		<datafield tag="700" ind1=" " ind2=" ">
			<subfield code="a">Hermansky, Hynek</subfield>
		</datafield>
		<datafield tag="856" ind1="4" ind2="0">
			<subfield code="i">EXTERNAL</subfield>
			<subfield code="u">http://publications.idiap.ch/attachments/reports/2003/rr03-56.pdf</subfield>
			<subfield code="x">PUBLIC</subfield>
		</datafield>
		<datafield tag="088" ind1=" " ind2=" ">
			<subfield code="a">Idiap-RR-56-2003</subfield>
		</datafield>
		<datafield tag="260" ind1=" " ind2=" ">
			<subfield code="c">2003</subfield>
			<subfield code="b">IDIAP</subfield>
			<subfield code="a">Martigny, Switzerland</subfield>
		</datafield>
		<datafield tag="500" ind1=" " ind2=" ">
			<subfield code="a">in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2004</subfield>
		</datafield>
		<datafield tag="520" ind1=" " ind2=" ">
			<subfield code="a">In general, entropy gives us a measure of the number of bits required to represent some information. When applied to a probability mass function (PMF), entropy can also be used to measure the "peakiness" of a distribution. In this paper, we propose using the entropy of the short-time Fourier transform spectrum, normalised as a PMF, as an additional feature for automatic speech recognition (ASR). It is indeed expected that a peaky spectrum, representing clear formant structure in the case of voiced sounds, will have low entropy, while a flatter spectrum corresponding to non-speech or noisy regions will have higher entropy. Extending this reasoning further, we introduce the idea of a multi-band/multi-resolution entropy feature, where we divide the spectrum into equal-size sub-bands and compute the entropy in each sub-band. The results presented in this paper show that multi-band entropy features used in conjunction with normal cepstral features improve the performance of an ASR system.</subfield>
		</datafield>
	</record>
</collection>