Keywords:
- ASR
- audio processing
- Automatic Speech Recognition
- binary masking
- constrained structural maximum a posteriori linear regression
- cross-lingual speaker adaptation
- data-driven enhancement
- decision tree marginalization
- decision trees
- dialogue
- Discourse Annotation
- domain adaptation
- hidden Markov models
- HMM state mapping
- HMM-based TTS
- Language Models
- Machine Translation
- microphone array
- minimum generation error
- multilingual acoustic modeling
- neural network
- overlapping speech recognition
- pattern matching
- personality impressions
- phonological constraints
- phonological knowledge
- regression class tree
- reliability estimation
- Representation and Processing
- speaker adaptation
- speech recognition
- speech separation
- speech synthesis
- Statistical parametric speech synthesis
- supervision
- temporal alignment
- time synchronisation
- time synchronization
- time-frequency analysis
- under-resourced languages
- unified models
- universal phoneme set
- unsupervised cross-lingual speaker adaptation
- verbal analysis
- vlogs
- vocal tract length normalization
- Web data
- youtube
Publications of John Dines
2009
- Measuring the gap between HMM-based ASR and TTS, in: Proceedings of Interspeech, Brighton, U.K., 2009
- Non-linear mapping for multi-channel speech separation and robust overlapping speech recognition, in: Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2009
- Real-Time ASR from Meetings, Idiap-RR-15-2009
- Real-Time ASR from Meetings, in: Proceedings of Interspeech, Brighton, U.K., 2009
- Speech recognition with speech synthesis models by marginalising over decision tree leaves, Idiap-RR-17-2009
- Speech recognition with speech synthesis models by marginalising over decision tree leaves, in: Proceedings of Interspeech, Brighton, U.K., 2009
- VTLN Adaptation for Statistical Speech Synthesis, Idiap-RR-41-2009

2008
- A Neural Network based Regression Approach for Recognizing Simultaneous Speech, Idiap-RR-10-2008
- Adaptive Beamforming with a Maximum Negentropy Criterion, Idiap-RR-29-2008
- Maximum kurtosis beamforming with the generalized sidelobe canceller, in: Proceedings of Interspeech, Brisbane, Australia, September 2008
- Neural Network based Regression for Robust Overlapping Speech Recognition using Microphone Arrays, Idiap-RR-09-2008
- Role Recognition in Multiparty Recordings using Social Affiliation Networks and Discrete Distributions, in: International Conference on Multimodal Interfaces, Chania, Greece, 2008

2007
- A study of phoneme and grapheme based context-dependent ASR systems, Idiap-RR-12-2007
- Direct optimisation of a multilayer perceptron for the estimation of cepstral mean and variance statistics, Idiap-RR-13-2007
- MLP-based Log Spectral Energy Mapping for Robust Overlapping Speech Recognition, Idiap-RR-54-2007
- Robust overlapping speech recognition based on neural networks, Idiap-RR-55-2007

2006
- A Generalized Dynamic Composition Algorithm of Weighted Finite State Transducers for Large Vocabulary Speech Recognition, Idiap-RR-62-2006
- Juicer: A Weighted Finite-State Transducer speech decoder, in: 3rd Joint Workshop on Multimodal Interaction and Related Machine Learning Algorithms (MLMI'06), 2006
- Juicer: A Weighted Finite-State Transducer speech decoder, Idiap-RR-21-2006
- The segmentation of multi-channel meeting recordings for automatic speech recognition, in: Int. Conf. on Spoken Language Processing (Interspeech ICSLP), 2006
- The segmentation of multi-channel meeting recordings for automatic speech recognition, Idiap-RR-22-2006

2005
- Improving Continuous Speech Recognition System Performance with Grapheme Modelling, Idiap-RR-16-2005

2004
- On the Use of Information Retrieval Measures for Speech Recognition Evaluation, Idiap-RR-73-2004
- Phoneme vs Grapheme Based Automatic Speech Recognition, Idiap-RR-48-2004
- Using RASTA in task independent TANDEM feature extraction, in: Proceedings of ICSLP, 2004
- Using RASTA in task independent TANDEM feature extraction, Idiap-RR-22-2004