Keywords:
- ASR
- audio processing
- Automatic Speech Recognition
- binary masking
- constrained structural maximum a posteriori linear regression
- cross-lingual speaker adaptation
- data-driven enhancement
- decision tree marginalization
- decision trees
- dialogue
- Discourse Annotation
- domain adaptation
- hidden Markov models
- HMM state mapping
- HMM-based TTS
- Language Models
- Machine Translation
- microphone array
- minimum generation error
- multilingual acoustic modeling
- neural network
- overlapping speech recognition
- pattern matching
- personality impressions
- phonological constraints
- phonological knowledge
- regression class tree
- reliability estimation
- Representation and Processing
- speaker adaptation
- speech recognition
- speech separation
- speech synthesis
- Statistical parametric speech synthesis
- supervision
- temporal alignment
- time synchronisation
- time synchronization
- time-frequency analysis
- under-resourced languages
- unified models
- universal phoneme set
- unsupervised cross-lingual speaker adaptation
- verbal analysis
- vlogs
- vocal tract length normalization
- Web data
- youtube
Publications of John Dines sorted by title
A
A Comparison of Supervised and Unsupervised Cross-Lingual Speaker Adaptation Approaches for HMM-Based Speech Synthesis, in: Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, Dallas, U.S.A., 2010
A Comparison of Supervised and Unsupervised Cross-Lingual Speaker Adaptation Approaches for HMM-Based Speech Synthesis, Idiap-RR-05-2010
A Generalized Dynamic Composition Algorithm of Weighted Finite State Transducers for Large Vocabulary Speech Recognition, Idiap-RR-62-2006
A Neural Network based Regression Approach for Recognizing Simultaneous Speech, Idiap-RR-10-2008
A study of phoneme and grapheme based context-dependent ASR systems, Idiap-RR-12-2007
Adaptive Beamforming with a Maximum Negentropy Criterion, Idiap-RR-29-2008
An Analysis of Language Mismatch in HMM State Mapping-Based Cross-Lingual Speaker Adaptation, Idiap-RR-16-2010
An Analysis of Language Mismatch in HMM State Mapping-Based Cross-Lingual Speaker Adaptation, in: Proceedings of Interspeech, Makuhari, Japan, 2010
Applying multi- and cross-lingual stochastic phone space transformations to non-native speech recognition, in: IEEE Transactions on Audio, Speech, and Language Processing, 2013
Automatic Temporal Alignment of AV Data, Idiap-RR-39-2009
Automatic Temporal Alignment of AV Data with Confidence Estimation, Idiap-RR-40-2009
Automatic Temporal Alignment of AV Data with Confidence Estimation, in: Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing, Dallas, USA, 2010
B
Bias Adaptation for Vocal Tract Length Normalization, Idiap-RR-12-2013
C
COMBINING VOCAL TRACT LENGTH NORMALIZATION WITH HIERARCHIAL LINEAR TRANSFORMATIONS, in: Proceedings in International conference on Speech and Signal processing, Kyoto, Japan, pages 4493-4496, IEEE SPS (ICASSP), 2012
Combining Vocal Tract Length Normalization with Hierarchical Linear Transformations, in: IEEE Journal of Selected Topics in Signal Processing - Special Issue on Statistical Parametric Speech Synthesis, 8(2):262-272, 2014
Combining Vocal Tract Length Normalization with Linear Transformations in a Bayesian Framework, Idiap-RR-11-2012
Comparing different acoustic modeling techniques for multilingual boosting, in: Proceedings of Interspeech, Portland, Oregon, 2012
Comparing different acoustic modeling techniques for multilingual boosting, Idiap-RR-01-2013
Current trends in multilingual speech processing, in: Sadhana, 36(5):885–915, 2011
D
Decision tree clustering for KL-HMM, Idiap-Com-01-2012
Direct optimisation of a multilayer perceptron for the estimation of cepstral mean and variance statistics, Idiap-RR-13-2007
Domain-specific language model adaptation: a case study, Idiap-Com-01-2013
E
Enhancing State Mapping-Based Cross-Lingual Speaker Adaptation using Phonological Knowledge in a Data-Driven Manner, Idiap-RR-08-2013
F
Feature Mapping of Multiple Beamformed Sources for Robust Overlapping Speech Recognition Using a Microphone Array, Idiap-RR-17-2014
H
Hi YouTube! Personality Impressions and Verbal Content in Social Video, in: 15th ACM International Conference on Multimodal Interaction, Sydney, Australia, ACM, 2013
I
Impact du degré de supervision sur l'adaptation à un domaine d'un modèle de langage à partir du Web, in: Actes de la conference conjointe JEP-TALN-RECITAL 2012, Grenoble, France, pages 193-200, ATALA/AFCP, 2012
Impact du degré de supervision sur l'adaptation à un domaine d'un modèle de langage à partir du Web, Idiap-RR-23-2012
Implementation of VTLN for Statistical Speech Synthesis, Idiap-RR-32-2010
Implementation of VTLN for Statistical Speech Synthesis, in: Proceedings of ISCA Speech Synthesis Workshop, Kyoto, Japan, 2010
Improving Continuous Speech Recognition System Performance with Grapheme Modelling, Idiap-RR-16-2005
Improving non-native ASR through stochastic multilingual phoneme space transformations, Idiap-RR-19-2011
Improving non-native ASR through stochastic multilingual phoneme space transformations, in: Proceedings of Interspeech, Florence, Italy, pages 537-540, 2011
J
Juicer: A Weighted Finite-State Transducer speech decoder, in: 3rd Joint Workshop on Multimodal Interaction and Related Machine Learning Algorithms MLMI'06, 2006
Juicer: A Weighted Finite-State Transducer speech decoder, Idiap-RR-21-2006
L
Language dependent universal phoneme posterior estimation for mixed language speech recognition, Idiap-RR-13-2011
Language dependent universal phoneme posterior estimation for mixed language speech recognition, in: Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing, Prague, CZ, pages 5012-5015, 2011
M
Maximum kurtosis beamforming with the generalized sidelobe canceller, in: Proceedings of INTERSPEECH, September 2008, Brisbane, Australia, 2008
Measuring the gap between HMM-based ASR and TTS, Idiap-RR-16-2009
Measuring the gap between HMM-based ASR and TTS, in: Proceedings of Interspeech, Brighton, U.K., 2009
Measuring the gap between HMM-based ASR and TTS, in: IEEE Journal of Selected Topics in Signal Processing, in print, 2010
Measuring the gap between HMM-based ASR and TTS, Idiap-RR-34-2010
MLP-based Log Spectral Energy Mapping for Robust Overlapping Speech Recognition, Idiap-RR-54-2007
N
Neural Network based Regression for Robust Overlapping Speech Recognition using Microphone Arrays, Idiap-RR-09-2008
Non-linear mapping for multi-channel speech separation and robust overlapping speech recognition, in: Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2009
O
On the Use of Information Retrieval Measures for Speech Recognition Evaluation, Idiap-RR-73-2004
P
Personalising speech-to-speech translation in the EMIME project, in: Proceedings of the ACL 2010 System Demonstrations, Association for Computational Linguistics, Uppsala, Sweden, 2010
Personalising speech-to-speech translation: Unsupervised cross-lingual speaker adaptation for HMM-based speech synthesis, in: Computer Speech and Language, 2011
Phoneme vs Grapheme Based Automatic Speech Recognition, Idiap-RR-48-2004
Phonological Knowledge Guided HMM State Mapping for Cross-Lingual Speaker Adaptation, in: Proceedings of Interspeech, Florence, Italy, 2011
Phonological Knowledge Guided HMM State Mapping for Cross-Lingual Speaker Adaptation, Idiap-RR-17-2011