Idiap Research Institute
Expanded Lattice Embeddings for Spoken Document Retrieval on Informal Meetings
Type of publication: Conference paper
Citation: VILLATORO-TELLO_SIGIR'22_2022
Publication status: Accepted
Booktitle: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval
Year: 2022
Month: July
Publisher: ACM
ISBN: 978-1-4503-8732-3/22/07
Crossref: VILLATORO-TELLO_Idiap-RR-06-2022
DOI: 10.1145/3477495.3531921
Abstract: In this paper, we evaluate different alternatives to process richer forms of Automatic Speech Recognition (ASR) output based on lattice expansion algorithms for Spoken Document Retrieval (SDR). Typically, SDR systems employ ASR transcripts to index and retrieve relevant documents. However, ASR errors negatively affect the retrieval performance. Multiple alternative hypotheses can also be used to augment the input to document retrieval to compensate for the erroneous one-best hypothesis. In Weighted Finite State Transducer-based ASR systems, using the n-best output (i.e. the top "n" scoring hypotheses) for the retrieval task is common, since they can easily be fed to a traditional Information Retrieval (IR) pipeline. However, the n-best hypotheses are terribly redundant, and do not sufficiently encapsulate the richness of the ASR output, which is represented as an acyclic directed graph called the lattice. In particular, we utilize the lattice's constrained minimum path cover to generate a minimum set of hypotheses that serve as input to the reranking phase of IR. The novelty of our proposed approach is the incorporation of the lattice as an input for neural reranking by considering a set of hypotheses that represents every arc in the lattice. The obtained hypotheses are encoded through sentence embeddings using BERT-based models, namely SBERT and RoBERTa, and the final ranking of the retrieved segments is obtained with a max-pooling operation over the computed scores among the input query and the hypotheses set. We present our evaluation on the publicly available AMI meeting corpus. Our results indicate that the proposed use of hypotheses from the expanded lattice improves the SDR performance significantly over the n-best ASR output.
Keywords:
Projects: Idiap
Authors: VILLATORO-TELLO, Esaú
Madikeri, Srikanth
Motlicek, Petr
Ganapathiraju, Aravind
Ivanov, Alexei V.
Attachments
  • VILLATORO-TELLO_SIGIR22_2022.pdf