CONF
Povey_ASRU2011_2011/IDIAP
The Kaldi Speech Recognition Toolkit
Povey, Daniel
Ghoshal, Arnab
Boulianne, Gilles
Burget, Lukas
Glembek, Ondrej
Goel, Nagendra
Hannemann, Mirko
Motlicek, Petr
Qian, Yanmin
Schwarz, Petr
Silovsky, Jan
Stemmer, Georg
Vesely, Karel
ASR
Automatic Speech Recognition
GMM
HTK
SGMM
EXTERNAL
https://publications.idiap.ch/attachments/papers/2012/Povey_ASRU2011_2011.pdf
PUBLIC
https://publications.idiap.ch/index.php/publications/showcite/Povey_Idiap-RR-04-2012
Related documents
IEEE 2011 Workshop on Automatic Speech Recognition and Understanding
Hilton Waikoloa Village, Big Island, Hawaii, US
2011
IEEE Signal Processing Society
IEEE Catalog No.: CFP11SRW-USB
978-1-4673-0366-8
We describe the design of Kaldi, a free, open-source
toolkit for speech recognition research. Kaldi provides a speech
recognition system based on finite-state transducers (using the
freely available OpenFst), together with detailed documentation
and scripts for building complete recognition systems. Kaldi
is written in C++, and the core library supports modeling of
arbitrary phonetic-context sizes, acoustic modeling with subspace
Gaussian mixture models (SGMM) as well as standard Gaussian
mixture models, together with all commonly used linear and
affine transforms. Kaldi is released under the Apache License
v2.0, which is highly nonrestrictive, making it suitable for a wide
community of users.
REPORT
Povey_Idiap-RR-04-2012/IDIAP
The Kaldi Speech Recognition Toolkit
Povey, Daniel
Ghoshal, Arnab
Boulianne, Gilles
Burget, Lukas
Glembek, Ondrej
Goel, Nagendra
Hannemann, Mirko
Motlicek, Petr
Qian, Yanmin
Schwarz, Petr
Silovsky, Jan
Stemmer, Georg
Vesely, Karel
ASR
Automatic Speech Recognition
GMM
HTK
SGMM
Idiap-RR-04-2012
2012
Idiap
Rue Marconi 19, Martigny
January 2012
We describe the design of Kaldi, a free, open-source
toolkit for speech recognition research. Kaldi provides a speech
recognition system based on finite-state automata (using the freely
available OpenFst), together with detailed documentation and a
comprehensive set of scripts for building complete recognition
systems. Kaldi is written in C++, and the core library supports
modeling of arbitrary phonetic-context sizes, acoustic modeling
with subspace Gaussian mixture models (SGMM) as well as
standard Gaussian mixture models, together with all commonly
used linear and affine transforms. Kaldi is released under the
Apache License v2.0, which is highly nonrestrictive, making it
suitable for a wide community of users.