Idiap Research Institute
Biologically Inspired Spiking Neural Networks for Speech Recognition
Type of publication: Thesis
Citation: Bittar_THESIS_2024
Year: 2024
Month: November
School: EPFL/EDEE
DOI: 10.5075/epfl-thesis-10659
Abstract: Biological neural networks, driving cognitive processes in the human brain, have long been a source of inspiration for computational models. Drawing from the physiology of neural dynamics, spiking neural networks stand out as prominent candidates for replicating and understanding the brain's functionality through efficient information processing. In this thesis, we investigate spiking neural networks during sequential processing by leveraging deep learning frameworks to train and evaluate them on speech recognition tasks. Focusing on the acoustic model, our approach captures the temporal patterns and phonetic features inherent to speech signals, providing insights into speech processing throughout the human auditory pathway. A first part is dedicated to conventional artificial neural networks and their utilisation of recurrence to address context dependencies and develop a form of working memory. Building upon a recent probabilistic derivation of recurrent neural networks, our research extends the approach and yields novel interpretable deep learning modules. While the main contributions remain theoretical, the resulting lightweight Bayesian recurrent units are shown to improve speech recognition performance compared to standard recurrent neural networks. In a second part, we shift to the main focus of the thesis: spiking neural networks, which encode and transmit information via sparse and binary spike sequences. Using the surrogate gradient method, we formulate physiologically inspired architectures as a special case of recurrent neural networks. This enables us to bootstrap a study of spiking neural networks from existing deep learning frameworks. Here we also explore the role of recurrence and memory in the form of different feedback mechanisms, including layer-wise recurrent connections and unit-wise spike frequency adaptation in the neuron model.
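The surrogate gradient formulation mentioned above treats a spiking neuron as a recurrent cell: the membrane potential is the hidden state, the spike is a thresholded output, and a smooth pseudo-derivative stands in for the gradient of the non-differentiable threshold function during backpropagation. A minimal sketch in plain Python of such a cell with unit-wise spike frequency adaptation (all constants, such as decay factors, threshold, and adaptation strength, are illustrative choices and not the thesis's values):

```python
import math

def lif_sfa_step(x, u, a, w=1.0, alpha=0.95, beta=0.9, b=0.2, theta=1.0):
    """One time step of a leaky integrate-and-fire neuron with
    spike frequency adaptation (SFA), written as a recurrent cell:
    (u, a) is the hidden state carried across steps."""
    u = alpha * u + w * x                    # leaky integration of the input
    s = 1.0 if u >= theta + b * a else 0.0   # spike when the adapted threshold is crossed
    a = beta * a + s                         # each spike builds up adaptation
    u = u * (1.0 - s)                        # reset the potential after a spike
    return s, u, a

def surrogate_grad(u, theta=1.0, scale=5.0):
    """Pseudo-derivative used in place of the Heaviside step's gradient,
    here the derivative of a steep sigmoid centred on the threshold."""
    sig = 1.0 / (1.0 + math.exp(-scale * (u - theta)))
    return scale * sig * (1.0 - sig)

# Unroll the cell over a constant input, exactly like an RNN over a sequence.
u, a, spikes = 0.0, 0.0, []
for _ in range(30):
    s, u, a = lif_sfa_step(x=0.3, u=u, a=a)
    spikes.append(s)
```

Setting b to zero disables the adaptation and yields a fixed inter-spike interval; with b positive, accumulated adaptation raises the effective threshold and lengthens the interval, which is the unit-wise feedback mechanism referred to above.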
While the main aim is to develop the understanding of physiological processes, our results on speech recognition tasks also contribute to the field of energy-efficient neuromorphic technology. Lastly, an analysis of our trained spiking architectures reveals the replication of key features observed in biological networks, offering a novel and scalable approach for their study. In particular, we explore the phenomenon of neural oscillations, characteristic of cognitive processes in the brain. Our analysis confirms the presence of cross-frequency couplings in the trained networks during speech processing, notably between theta and gamma frequency bands. This synchronisation of the spiking activity is shown to arise naturally, simply through gradient descent training, and is enhanced by the incorporation of recurrent mechanisms. In summary, this thesis contributes to the field of neural network research by offering insights into the concepts of recurrence and spiking dynamics during sequential processing tasks, particularly in the context of speech recognition. Through a combination of theoretical analysis and practical experimentation, we develop novel methods, deep learning modules, and physiologically inspired architectures that advance our understanding of neural computation and its applications. By replicating key features observed in biological networks, our research contributes to future developments in neuromorphic computing and cognitive science.
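The cross-frequency coupling analysis described above can be illustrated on synthetic data: bin a fast oscillation's amplitude by the phase of a slow rhythm and check whether the resulting profile is peaked rather than flat. A toy sketch in plain Python, where the frequencies and the crude rectified envelope are illustrative assumptions and not the analysis pipeline used in the thesis:

```python
import math

fs = 1000                      # sampling rate in Hz, illustrative
n = 2 * fs                     # two seconds of signal
theta_f, gamma_f = 6.0, 40.0   # nominal theta and gamma frequencies (assumed values)

# Synthetic activity: a gamma oscillation whose amplitude follows theta phase.
theta_phase = [2 * math.pi * theta_f * t / fs for t in range(n)]
signal = [(1.0 + math.cos(p)) * math.cos(2 * math.pi * gamma_f * t / fs)
          for t, p in enumerate(theta_phase)]

# Bin the (crudely rectified) gamma amplitude by theta phase: a flat profile
# means no coupling, a strongly peaked one means phase-amplitude coupling.
# The normalised peak-to-trough ratio serves as a crude modulation index.
n_bins = 18
bins = [[] for _ in range(n_bins)]
for p, x in zip(theta_phase, signal):
    b = int((p % (2 * math.pi)) / (2 * math.pi) * n_bins)
    bins[b].append(abs(x))
profile = [sum(vals) / len(vals) for vals in bins]
coupling = (max(profile) - min(profile)) / (max(profile) + min(profile))
```

Here the coupling index approaches one because the gamma amplitude was constructed to depend on theta phase; an uncoupled signal would give a near-flat profile and an index near zero.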
Keywords: Bayesian recurrent units, neural oscillations, neuromorphic technology, recurrent neural networks, speech recognition, spiking neural networks, surrogate gradient
Projects: Idiap
Authors: Bittar, Alexandre
Attachments
  • Bittar_THESIS_2024.pdf