<?xml version="1.0" encoding="UTF-8"?>
<collection xmlns="http://www.loc.gov/MARC21/slim">
	<record>
		<datafield tag="980" ind1=" " ind2=" ">
			<subfield code="a">CONF</subfield>
		</datafield>
		<datafield tag="970" ind1=" " ind2=" ">
			<subfield code="a">Dey_INTERSPEECH2019_2019/IDIAP</subfield>
		</datafield>
		<datafield tag="245" ind1=" " ind2=" ">
			<subfield code="a">Exploiting semi-supervised training through a dropout regularization in end-to-end speech recognition</subfield>
		</datafield>
		<datafield tag="700" ind1=" " ind2=" ">
			<subfield code="a">Dey, Subhadeep</subfield>
		</datafield>
		<datafield tag="700" ind1=" " ind2=" ">
			<subfield code="a">Motlicek, Petr</subfield>
		</datafield>
		<datafield tag="700" ind1=" " ind2=" ">
			<subfield code="a">Bui, Trung</subfield>
		</datafield>
		<datafield tag="700" ind1=" " ind2=" ">
			<subfield code="a">Dernoncourt, Franck</subfield>
		</datafield>
		<datafield tag="711" ind1="2" ind2=" ">
			<subfield code="a">Proc. of Interspeech 2019</subfield>
		</datafield>
		<datafield tag="260" ind1=" " ind2=" ">
			<subfield code="c">2019</subfield>
		</datafield>
		<datafield tag="520" ind1=" " ind2=" ">
<subfield code="a">In this paper, we explore various approaches for semi-supervised learning in an end-to-end automatic speech recognition (ASR) framework. The first step in our approach involves training a seed model on the limited amount of labelled data. Additional unlabelled speech data is employed through a data-selection mechanism to obtain the best hypothesized output, which is then used to retrain the seed model. However, the uncertainties of the model may not be well captured with a single hypothesis. In contrast to this technique, we apply a dropout mechanism to capture the uncertainty by obtaining multiple hypothesized text transcripts of a speech recording. We assume that the diversity of automatically generated transcripts for an utterance will implicitly increase the reliability of the model. Finally, the data-selection process is also applied to these hypothesized transcripts to reduce the uncertainty. Experiments on the freely available TEDLIUM corpus and a proprietary Adobe internal dataset show that the proposed approach significantly reduces ASR errors compared to the baseline model.</subfield>
		</datafield>
	</record>
</collection>