<?xml version="1.0" encoding="UTF-8"?>
<collection xmlns="http://www.loc.gov/MARC21/slim">
	<record>
		<datafield tag="980" ind1=" " ind2=" ">
			<subfield code="a">ARTICLE</subfield>
		</datafield>
		<datafield tag="970" ind1=" " ind2=" ">
			<subfield code="a">Thimm-97.1/IDIAP</subfield>
		</datafield>
		<datafield tag="245" ind1=" " ind2=" ">
			<subfield code="a">High Order and Multilayer Perceptron Initialization</subfield>
		</datafield>
		<datafield tag="700" ind1=" " ind2=" ">
			<subfield code="a">Thimm, Georg</subfield>
		</datafield>
		<datafield tag="700" ind1=" " ind2=" ">
			<subfield code="a">Fiesler, Emile</subfield>
		</datafield>
		<datafield tag="653" ind1="1" ind2=" ">
			<subfield code="a">activation function</subfield>
		</datafield>
		<datafield tag="653" ind1="1" ind2=" ">
			<subfield code="a">comparison of weight initialization methods</subfield>
		</datafield>
		<datafield tag="653" ind1="1" ind2=" ">
			<subfield code="a">connectionism</subfield>
		</datafield>
		<datafield tag="653" ind1="1" ind2=" ">
			<subfield code="a">high(er) order neural network</subfield>
		</datafield>
		<datafield tag="653" ind1="1" ind2=" ">
			<subfield code="a">high(er) order perceptron</subfield>
		</datafield>
		<datafield tag="653" ind1="1" ind2=" ">
			<subfield code="a">initial weight</subfield>
		</datafield>
		<datafield tag="653" ind1="1" ind2=" ">
			<subfield code="a">initial weight distribution</subfield>
		</datafield>
		<datafield tag="653" ind1="1" ind2=" ">
			<subfield code="a">interconnection strength</subfield>
		</datafield>
		<datafield tag="653" ind1="1" ind2=" ">
			<subfield code="a">learning rate</subfield>
		</datafield>
		<datafield tag="653" ind1="1" ind2=" ">
			<subfield code="a">multilayer perceptron</subfield>
		</datafield>
		<datafield tag="653" ind1="1" ind2=" ">
			<subfield code="a">neural computation</subfield>
		</datafield>
		<datafield tag="653" ind1="1" ind2=" ">
			<subfield code="a">neural network</subfield>
		</datafield>
		<datafield tag="653" ind1="1" ind2=" ">
			<subfield code="a">neural network initialization</subfield>
		</datafield>
		<datafield tag="653" ind1="1" ind2=" ">
			<subfield code="a">neurocomputing</subfield>
		</datafield>
		<datafield tag="653" ind1="1" ind2=" ">
			<subfield code="a">optimization</subfield>
		</datafield>
		<datafield tag="653" ind1="1" ind2=" ">
			<subfield code="a">random weight initialization</subfield>
		</datafield>
		<datafield tag="653" ind1="1" ind2=" ">
			<subfield code="a">real-world benchmark</subfield>
		</datafield>
		<datafield tag="653" ind1="1" ind2=" ">
			<subfield code="a">sigma-pi connection</subfield>
		</datafield>
		<datafield tag="653" ind1="1" ind2=" ">
			<subfield code="a">weight initialization</subfield>
		</datafield>
		<datafield tag="856" ind1="4" ind2=" ">
			<subfield code="u">http://publications.idiap.ch/index.php/publications/showcite/Thimm-95.1</subfield>
			<subfield code="z">Related documents</subfield>
		</datafield>
		<datafield tag="773" ind1=" " ind2=" ">
			<subfield code="p">IEEE Transactions on Neural Networks</subfield>
			<subfield code="v">8</subfield>
			<subfield code="n">02</subfield>
			<subfield code="x">1045-9227</subfield>
		</datafield>
		<datafield tag="260" ind1=" " ind2=" ">
			<subfield code="c">1997</subfield>
		</datafield>
		<datafield tag="520" ind1=" " ind2=" ">
			<subfield code="a">Proper initialization is one of the most important prerequisites for fast convergence of feed-forward neural networks such as high order and multilayer perceptrons. This publication aims to determine the optimal variance (or range) for the initial weights and biases, which is the principal parameter of random initialization methods for both types of neural networks. An overview of random weight initialization methods for multilayer perceptrons is presented. These methods are extensively tested using eight real-world benchmark data sets and a broad range of initial weight variances by means of more than 30,000 simulations, with the aim of finding the best weight initialization method for multilayer perceptrons. For high order networks, a large number of experiments (more than 200,000 simulations) were performed, using three weight distributions, three activation functions, several network orders, and the same eight data sets. The results of these experiments are compared to weight initialization techniques for multilayer perceptrons, which leads to the proposal of a suitable initialization method for high order perceptrons. The conclusions on the initialization methods for both types of networks are justified by sufficiently small confidence intervals of the mean convergence times.</subfield>
		</datafield>
	</record>
	<record>
		<datafield tag="980" ind1=" " ind2=" ">
			<subfield code="a">CHAPTER</subfield>
		</datafield>
		<datafield tag="970" ind1=" " ind2=" ">
			<subfield code="a">Thimm-95.1/IDIAP</subfield>
		</datafield>
		<datafield tag="245" ind1=" " ind2=" ">
			<subfield code="a">Neural Network Initialization</subfield>
		</datafield>
		<datafield tag="700" ind1=" " ind2=" ">
			<subfield code="a">Thimm, Georg</subfield>
		</datafield>
		<datafield tag="700" ind1=" " ind2=" ">
			<subfield code="a">Fiesler, Emile</subfield>
		</datafield>
		<datafield tag="700" ind1=" " ind2=" ">
			<subfield code="a">Mira, J.</subfield>
			<subfield code="e">Ed.</subfield>
		</datafield>
		<datafield tag="700" ind1=" " ind2=" ">
			<subfield code="a">Sandoval, F.</subfield>
			<subfield code="e">Ed.</subfield>
		</datafield>
		<datafield tag="711" ind1="2" ind2=" ">
			<subfield code="a">From Natural to Artificial Neural Computation</subfield>
		</datafield>
		<datafield tag="440" ind1=" " ind2=" ">
			<subfield code="a">Lecture Notes in Computer Science</subfield>
		</datafield>
		<datafield tag="773" ind1=" " ind2=" ">
			<subfield code="v">930</subfield>
			<subfield code="c">535-542</subfield>
			<subfield code="z">3-540-59497-3</subfield>
		</datafield>
		<datafield tag="260" ind1=" " ind2=" ">
			<subfield code="c">1995</subfield>
			<subfield code="b">Springer Verlag</subfield>
			<subfield code="a">Berlin</subfield>
		</datafield>
	</record>
</collection>