REPORT grangier:2004:idiap-04-82/IDIAP
Effect of Recognition Errors on Text Clustering
Grangier, David
Vinciarelli, Alessandro
EXTERNAL http://publications.idiap.ch/attachments/reports/2004/rr04-82.pdf
PUBLIC
Idiap-RR-82-2004
2004
IDIAP

This paper presents clustering experiments performed on noisy texts, i.e. texts extracted through an automatic process such as character or speech recognition. The effect of recognition errors is investigated by comparing clustering results obtained on clean (manually typed) and noisy (automatic speech transcription) versions of the same speech recording corpus.