<?xml version="1.0" encoding="UTF-8"?>
<collection xmlns="http://www.loc.gov/MARC21/slim">
	<record>
		<datafield tag="980" ind1=" " ind2=" ">
			<subfield code="a">CONF</subfield>
		</datafield>
		<datafield tag="970" ind1=" " ind2=" ">
			<subfield code="a">Dubagunta_ICASSP-2_2019/IDIAP</subfield>
		</datafield>
		<datafield tag="245" ind1=" " ind2=" ">
			<subfield code="a">Learning voice source related information for depression detection</subfield>
		</datafield>
		<datafield tag="700" ind1=" " ind2=" ">
			<subfield code="a">Dubagunta, S. Pavankumar</subfield>
		</datafield>
		<datafield tag="700" ind1=" " ind2=" ">
			<subfield code="a">Vlasenko, Bogdan</subfield>
		</datafield>
		<datafield tag="700" ind1=" " ind2=" ">
			<subfield code="a">Magimai-Doss, Mathew</subfield>
		</datafield>
		<datafield tag="653" ind1="1" ind2=" ">
			<subfield code="a">Convolutional Neural Networks</subfield>
		</datafield>
		<datafield tag="653" ind1="1" ind2=" ">
			<subfield code="a">depression detection</subfield>
		</datafield>
		<datafield tag="653" ind1="1" ind2=" ">
			<subfield code="a">glottal source signals</subfield>
		</datafield>
		<datafield tag="653" ind1="1" ind2=" ">
			<subfield code="a">zero-frequency filtering</subfield>
		</datafield>
		<datafield tag="856" ind1="4" ind2="0">
			<subfield code="i">EXTERNAL</subfield>
			<subfield code="u">http://publications.idiap.ch/attachments/papers/2019/Dubagunta_ICASSP-2_2019.pdf</subfield>
			<subfield code="x">PUBLIC</subfield>
		</datafield>
		<datafield tag="711" ind1="2" ind2=" ">
			<subfield code="a">Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)</subfield>
		</datafield>
		<datafield tag="260" ind1=" " ind2=" ">
			<subfield code="c">2019</subfield>
		</datafield>
		<datafield tag="520" ind1=" " ind2=" ">
			<subfield code="a">During depression, neurophysiological changes can occur that may affect laryngeal control, i.e., the behaviour of the vocal folds. Characterising these changes precisely from speech signals is a non-trivial task, as it typically requires reliable separation of the voice source information from the speech signal. In this paper, by exploiting the ability of CNNs to learn task-relevant information from raw input signals, we investigate several methods to model voice source related information for depression detection. Specifically, we investigate the modelling of low-pass filtered speech signals, linear prediction residual signals, homomorphically filtered voice source signals and zero-frequency filtered signals to learn voice source related information for depression detection. Our investigations show that subsegmental-level modelling of linear prediction residual signals or zero-frequency filtered signals yields systems that outperform both state-of-the-art low-level-descriptor-based systems and deep-learning-based systems that model vocal tract information.</subfield>
		</datafield>
	</record>
</collection>