CONF
vin03d-art/IDIAP
Noisy Text Categorization
Vinciarelli, Alessandro
EXTERNAL
https://publications.idiap.ch/attachments/reports/2003/rr03-61.pdf
PUBLIC
https://publications.idiap.ch/index.php/publications/showcite/vincia03d
Related documents
Proceedings of International Conference on Pattern Recognition (ICPR)
2004
554-557
IDIAP-RR 03-61
This work presents a system for the categorization of noisy texts. Here, noisy text means any text obtained through an error-prone extraction process from media other than digital text. We show that, even with an average Word Error Rate of around 50%, the loss in categorization performance relative to the clean version of the same documents is negligible.
REPORT
vincia03d/IDIAP
Noisy Text Categorization
Vinciarelli, Alessandro
EXTERNAL
https://publications.idiap.ch/attachments/reports/2003/rr03-61.pdf
PUBLIC
Idiap-RR-61-2003
2003
IDIAP
This work presents a system for the categorization of noisy texts. Here, noisy text means any text obtained through an error-prone extraction process from media other than digital text. We show that, even with an average Word Error Rate of around 50%, the loss in categorization performance relative to the clean version of the same documents is negligible.