Keywords:
- accent embedding
- Accented speech
- acoustic modeling
- Anti-spoofing
- Articulation
- ASR-free
- audio & text embeddings
- Automatic accent assessment
- Automatic accent evaluation
- Automatic prosodic event detection
- automatic reading tutor
- Binary pattern matching
- Bob toolbox
- child speech recognition
- cognition
- Compressive sampling
- computer aided learning
- connectionist temporal classification
- continuous F0 coding
- Convolutional Neural Networks
- Deep neural network (DNN)
- deep neural networks
- dynamic programming
- Dysarthria
- end-to-end
- Fast $k$NN
- intelligibility
- Kaldi toolkit
- keyword spotting
- KL-divergence
- KL-HMM
- Kullback-Leibler divergence
- laboratory phonology
- language identification
- lexical model
- Linguistic parsing
- low bit rate speech coding
- Low bit rate speech vocoding
- multi-task
- Multilingual automatic speech recognition
- nasal sounds
- nearest neighbour rule of classification
- neural computing
- non-modal phonation
- non-native speech
- open science
- open vocabulary
- parametric speech synthesis
- Parametric vocoding
- Parkinson's disease
- Phonation
- Phone attributes
- Phoneme classification
- phonetic representation
- Phonological features
- phonological posteriors
- Phonological speech representation
- phonological vocoding
- phonology
- pitch analysis
- Posterior features
- Posterior representatives
- probabilistic amplitude demodulation
- prosody
- python
- Quantized posterior hashing
- Reproducible research
- software
- speaker verification
- spectral amplitude modulation phase hierarchy
- Speech Analysis
- speech coding
- speech emphasis
- speech perception
- speech processing
- speech production
- speech prosody
- speech recognition
- speech synthesis
- spiking neural networks
- Structured sparse representation
- Structured sparsity
- triphone mapping
- Very low bit rate speech coding
- word emphasis
Publications of Milos Cernak sorted by journal and type
IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING
Incremental Syllable-Context Phonetic Vocoding, , , , and , in: IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, 23(6), 2015 |
[URL] |
Speech Communication
On Structured Sparsity of Phonological Posteriors for Linguistic Parsing, , and , in: Speech Communication, 84:36-45, 2016 |
[DOI] [URL] |
Handbook of Biometric Anti-Spoofing (2019)
Voice Presentation Attack Detection Using Convolutional Neural Networks, , , , , and , in: Handbook of Biometric Anti-Spoofing, pages 391-415, Springer, 2019 |
[URL] |
International Conference on Speech and Language Processing, Interspeech (2019)
End-to-End Accented Speech Recognition, , and , in: International Conference on Speech and Language Processing, Interspeech, ISCA, Graz, Austria, pages 2140-2144, 2019 |
[DOI] |
Proceedings of Interspeech 2019 (2019)
Open-Vocabulary Keyword Spotting With Audio And Text Embeddings, , , and , in: Proceedings of Interspeech 2019, 2019 |
[DOI] |
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (2018)
Nasal Speech Sounds Detection Using Connectionist Temporal, and , in: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2018 |
|
Proc. of Interspeech (2017)
Bob Speaks Kaldi, , , , and , in: Proc. of Interspeech, 2017 |
|
Proceedings of 2017 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2017) (2017)
Multi-view Representation Learning Via GCCA for Multimodal Analysis of Parkinson's Disease, , , , , , , , , , , , , , and , in: Proceedings of 2017 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2017), 2017 |
|
On the Impact of Non-modal Phonation On Phonological Features, , , , , , , , , , , , , and , in: Proceedings of 2017 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2017), 2017 |
|
Workshop on Signal Processing with Adaptive Sparse Structured Representations (SPARS) (2017)
Sparse Pronunciation Codes for Perceptual Phonetic Information Assessment, , , and , in: Workshop on Signal Processing with Adaptive Sparse Structured Representations (SPARS), 2017 |
|
Interspeech (2016)
Efficient Posterior Exemplar Search Space Hashing Exploiting Class-Specific Sparsity Structures, , , and , in: Interspeech, San Francisco, CA, 2016 |
|
Proceedings of Interspeech (2016)
HMM-based Non-native Accent Assessment using Posterior Features, , and , in: Proceedings of Interspeech, San Francisco, USA, 2016 |
|
9th ISCA Speech Synthesis Workshop (2016)
Investigating Spectral Amplitude Modulation Phase Hierarchy Features in Speech Synthesis, , , and , in: 9th ISCA Speech Synthesis Workshop, 2016 |
|
Proc. of EUSIPCO (2016)
Modeling Unvoiced Sounds In Statistical Parametric Speech Synthesis with a Continuous Vocoder, , , and , in: Proc. of EUSIPCO, Budapest, Hungary, 2016 |
|
Proceeding on the 7th Workshop on Speech and Language Processing for Assistive Technologies (SLPAT) (2016)
PAoS Markers: Trajectory Analysis of Selective Phonological Posteriors for Assessment of Progressive Apraxia of Speech, , and , in: Proceeding on the 7th Workshop on Speech and Language Processing for Assistive Technologies (SLPAT), 2016 |
|
Interspeech (2016)
PhonVoc: A Phonetic and Phonological Vocoding Toolkit, and , in: Interspeech, San Francisco, USA, 2016 |
|
Proceedings of Interspeech (2016)
Probabilistic Amplitude Demodulation Features in Speech Synthesis for Improving Prosody, , and , in: Proceedings of Interspeech, San Francisco, USA, 2016 |
|
Interspeech (2016)
Sound Pattern Matching for Automatic Prosodic Event Detection, , , , and , in: Interspeech, San Francisco, USA, 2016 |
|
Proc. of Interspeech (2015)
An Empirical Model of Emphatic Word Detection, and , in: Proc. of Interspeech, Dresden, Germany, pages 573-577, ISCA, 2015 |
|
Proceedings of Interspeech (2015)
Automatic Accentedness Evaluation of Non-Native Speech Using Phonetic and Sub-Phonetic Posterior Probabilities, , , and , in: Proceedings of Interspeech, 2015 |
|
Proc. of Interspeech (2015)
Neuromorphic Based Oscillatory Device for Incremental Syllable Boundary Detection, and , in: Proc. of Interspeech, Dresden, Germany, pages 1191-1195, ISCA, 2015 |
|
Proceedings of Interspeech (2015)
On Compressibility of Neural Network Phonological Features for Low Bit Rate Speech Coding, , and , in: Proceedings of Interspeech, pages 418-422, ISCA, 2015 |
|
IEEE 40th International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2015)
Phonological Vocoding Using Artificial Neural Networks, , and , in: IEEE 40th International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brisbane, Australia, pages 4844-4848, IEEE, 2015 |
[DOI] |
Proceedings of the 15th Annual Conference of the International Speech Communication Association (Interspeech 2014) (2014)
Development of Bilingual ASR System for MediaParl Corpus, , , and , in: Proceedings of the 15th Annual Conference of the International Speech Communication Association (Interspeech 2014), Singapore, ISCA, 2014 |
|
Interspeech (2014)
Stress and Accent Transmission In HMM-Based Syllable-Context Very Low Bit Rate Speech Coding, , , and , in: Interspeech, 2014 |
|
International Conference on Affective Computing and Intelligent Interaction (2013)
Automatic Staging of Audio with Emotions, and , in: International Conference on Affective Computing and Intelligent Interaction, 2013 |
Proceedings of the IEEE Intl. Conference on Acoustics, Speech and Signal Processing (ICASSP) (2013)
On the (Un)importance of the Contextual Factors In HMM-Based Speech Synthesis, , and , in: Proceedings of the IEEE Intl. Conference on Acoustics, Speech and Signal Processing (ICASSP), Vancouver, Canada, pages 8140-8143, 2013 |
|
Proc. of Interspeech 2013 (2013)
Syllable-based Pitch Encoding for Low Bit Rate Speech Coding with Recognition/Synthesis Architecture, , and , in: Proc. of Interspeech 2013, Lyon, France, 2013 |
|
Workshop on Child, Computer and Interaction (2012)
Reading Companion: The Technical and Social Design of an Automated Reading Tutor, , , , , and , in: Workshop on Child, Computer and Interaction, Portland, Oregon, U.S.A., 2012 |
|
Proceedings of Interspeech (2012)
Robust triphone mapping for acoustic modeling, , and , in: Proceedings of Interspeech, Portland, Oregon, 2012 |
|