<?xml version="1.0" encoding="UTF-8"?>
<collection xmlns="http://www.loc.gov/MARC21/slim">
	<record>
		<datafield tag="980" ind1=" " ind2=" ">
			<subfield code="a">REPORT</subfield>
		</datafield>
		<datafield tag="970" ind1=" " ind2=" ">
			<subfield code="a">Motlicek_Idiap-RR-01-2020/IDIAP</subfield>
		</datafield>
		<datafield tag="245" ind1=" " ind2=" ">
			<subfield code="a">AM-FM DECOMPOSITION OF SPEECH SIGNAL: APPLICATIONS FOR SPEECH PRIVACY AND DIAGNOSIS</subfield>
		</datafield>
		<datafield tag="700" ind1=" " ind2=" ">
			<subfield code="a">Motlicek, Petr</subfield>
		</datafield>
		<datafield tag="700" ind1=" " ind2=" ">
			<subfield code="a">Hermansky, Hynek</subfield>
		</datafield>
		<datafield tag="700" ind1=" " ind2=" ">
			<subfield code="a">Madikeri, Srikanth</subfield>
		</datafield>
		<datafield tag="700" ind1=" " ind2=" ">
			<subfield code="a">Prasad, Amrutha</subfield>
		</datafield>
		<datafield tag="700" ind1=" " ind2=" ">
			<subfield code="a">Ganapathy, Sriram</subfield>
		</datafield>
		<datafield tag="653" ind1="1" ind2=" ">
			<subfield code="a">AM</subfield>
		</datafield>
		<datafield tag="653" ind1="1" ind2=" ">
			<subfield code="a">Automatic Speech Recognition</subfield>
		</datafield>
		<datafield tag="653" ind1="1" ind2=" ">
			<subfield code="a">FM</subfield>
		</datafield>
		<datafield tag="653" ind1="1" ind2=" ">
			<subfield code="a">Linear prediction</subfield>
		</datafield>
		<datafield tag="653" ind1="1" ind2=" ">
			<subfield code="a">speaker recognition</subfield>
		</datafield>
		<datafield tag="856" ind1="4" ind2="0">
			<subfield code="i">EXTERNAL</subfield>
			<subfield code="u">http://publications.idiap.ch/attachments/reports/2019/Motlicek_Idiap-RR-01-2020.pdf</subfield>
			<subfield code="x">PUBLIC</subfield>
		</datafield>
		<datafield tag="088" ind1=" " ind2=" ">
			<subfield code="a">Idiap-RR-01-2020</subfield>
		</datafield>
		<datafield tag="260" ind1=" " ind2=" ">
			<subfield code="c">2020</subfield>
			<subfield code="b">Idiap</subfield>
			<subfield code="a">Rue Marconi 19</subfield>
		</datafield>
		<datafield tag="771" ind1="2" ind2=" ">
			<subfield code="d">January 2020</subfield>
		</datafield>
		<datafield tag="520" ind1=" " ind2=" ">
			<subfield code="a">Although current trends in speech processing favor data-driven deep learning, many potential applications lack sufficient training or development data. Lightweight signal processing techniques are therefore still of interest. This paper describes an efficient technique for decomposing the speech signal into AM and FM components that is not based on frame-by-frame short-time analysis of the signal. Instead, we estimate all-pole models of frequency-localized Hilbert envelopes of long segments of the speech signal at different frequencies. Decomposing the speech signal into AM and FM components is of interest in voice studies that benefit from suppressing the message-bearing components of speech (e.g., security-oriented applications such as speaker recognition, or speech diagnosis, often relying on spectral averaging to discard the content of the speech). Similarly, discarding speaker information while preserving the message in the speech is of interest for privacy-oriented applications. Experimental results on automatic speech and speaker recognition tasks clearly show that the AM component preserves the content (message) of the speech, while the FM component carries the information related to the speaker.</subfield>
		</datafield>
	</record>
</collection>