<?xml version="1.0" encoding="UTF-8"?>
<collection xmlns="http://www.loc.gov/MARC21/slim">
	<record>
		<datafield tag="980" ind1=" " ind2=" ">
			<subfield code="a">CONF</subfield>
		</datafield>
		<datafield tag="970" ind1=" " ind2=" ">
			<subfield code="a">Roy_ICASSP11_2011/IDIAP</subfield>
		</datafield>
		<datafield tag="245" ind1=" " ind2=" ">
			<subfield code="a">Phoneme Recognition using Boosted Binary Features</subfield>
		</datafield>
		<datafield tag="700" ind1=" " ind2=" ">
			<subfield code="a">Roy, Anindya</subfield>
		</datafield>
		<datafield tag="700" ind1=" " ind2=" ">
			<subfield code="a">Magimai-Doss, Mathew</subfield>
		</datafield>
		<datafield tag="700" ind1=" " ind2=" ">
			<subfield code="a">Marcel, Sébastien</subfield>
		</datafield>
		<datafield tag="856" ind1="4" ind2="0">
			<subfield code="i">EXTERNAL</subfield>
			<subfield code="u">http://publications.idiap.ch/attachments/papers/2011/Roy_ICASSP11_2011.pdf</subfield>
			<subfield code="x">PUBLIC</subfield>
		</datafield>
		<datafield tag="711" ind1="2" ind2=" ">
			<subfield code="a">IEEE International Conference on Acoustics, Speech and Signal Processing 2011</subfield>
		</datafield>
		<datafield tag="260" ind1=" " ind2=" ">
			<subfield code="c">2011</subfield>
		</datafield>
		<datafield tag="520" ind1=" " ind2=" ">
			<subfield code="a">In this paper, we propose a novel parts-based, binary-valued feature for automatic speech recognition (ASR).
This feature is extracted using boosted ensembles of simple threshold-based classifiers.
Each such classifier looks at a specific pair of time-frequency bins located on the spectro-temporal plane.
These features, termed Boosted Binary Features (BBF), are integrated into a standard HMM-based system
using a multilayer perceptron (MLP) or a single-layer perceptron (SLP).
Preliminary studies on the TIMIT phoneme recognition task show that, using an MLP,
BBF yields similar or better performance compared to MFCC
(67.8% accuracy for BBF vs. 66.3% for MFCC),
while using an SLP it yields significantly better performance than MFCC
(62.8% accuracy for BBF vs. 45.9% for MFCC).
This demonstrates the potential of the proposed feature for speech recognition.</subfield>
		</datafield>
	</record>
</collection>