Hacker M, Nöth E (2014)
Publication Type: Conference contribution
Publication year: 2014
Publisher: Institute of Electrical and Electronics Engineers Inc.
Edited Volumes: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Page range: 2317-2321
Conference Proceedings Title: Proceedings of ICASSP 2014
Event location: Florence
DOI: 10.1109/ICASSP.2014.6854013
We present a new method to recover the correct transcript from automatic speech recognition (ASR) output containing multiple hypotheses. The error-prone ASR process is treated as a black box and modeled as a noisy channel at the phoneme level. The probabilities of the individual phoneme errors are assigned according to phonetic confusability. We score candidate hypotheses by their posterior probability of being the channel input, given the competing ASR hypotheses as the observed output. The resulting scores provide useful information not included in traditional confidence measures. We investigated the usefulness of the method for rescoring, re-ranking and word error detection. The method alone is not powerful enough to improve the recognition results, but by employing a decision tree classifier it is possible to isolate cases where the method works very well. Our results show that combining the method with other knowledge sources and postprocessing techniques can lead to promising improvements. © 2014 IEEE.
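The scoring idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the confusion probabilities, the `SIMILAR` phoneme pairs, the uniform prior over candidates, the best-alignment (Viterbi) approximation of the channel probability, and the assumption that the ASR hypotheses are conditionally independent given the channel input are all hypothetical choices made here for illustration only.

```python
import math

# Hypothetical toy channel parameters (illustrative values, not from the paper).
P_MATCH = 0.9      # phoneme passes through the channel unchanged
P_SIMILAR = 0.05   # substitution between phonetically similar phonemes
P_OTHER = 0.005    # substitution between dissimilar phonemes
P_INS = 0.01       # channel inserts a phoneme
P_DEL = 0.01       # channel deletes a phoneme

# Hypothetical set of confusable phoneme pairs.
SIMILAR = {frozenset(p) for p in [("p", "b"), ("t", "d"), ("k", "g"),
                                  ("s", "z"), ("m", "n")]}


def sub_prob(src, obs):
    """Probability that channel input phoneme `src` is observed as `obs`."""
    if src == obs:
        return P_MATCH
    if frozenset((src, obs)) in SIMILAR:
        return P_SIMILAR
    return P_OTHER


def channel_logprob(source, observed):
    """Log-probability of the best alignment of `observed` given `source`."""
    n, m = len(source), len(observed)
    neg_inf = float("-inf")
    dp = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    dp[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0 and j > 0:  # match or substitution
                step = math.log(sub_prob(source[i - 1], observed[j - 1]))
                dp[i][j] = max(dp[i][j], dp[i - 1][j - 1] + step)
            if i > 0:            # source phoneme deleted by the channel
                dp[i][j] = max(dp[i][j], dp[i - 1][j] + math.log(P_DEL))
            if j > 0:            # observed phoneme inserted by the channel
                dp[i][j] = max(dp[i][j], dp[i][j - 1] + math.log(P_INS))
    return dp[n][m]


def posterior_scores(candidates, hypotheses):
    """Posterior of each candidate being the channel input (uniform prior)."""
    log_scores = [sum(channel_logprob(c, h) for h in hypotheses)
                  for c in candidates]
    top = max(log_scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in log_scores]
    total = sum(weights)
    return [w / total for w in weights]


# Toy phoneme sequences; the symbols are placeholders, not a real phone set.
candidates = [("b", "a", "t"), ("p", "a", "t"), ("k", "a", "t")]
hypotheses = [("p", "a", "t"), ("b", "a", "d"), ("b", "a", "t")]
scores = posterior_scores(candidates, hypotheses)
for cand, score in zip(candidates, scores):
    print("".join(cand), round(score, 4))
```

With these toy values, the first candidate receives the highest posterior because it is phonetically close to all three competing hypotheses. A real system would estimate the confusion probabilities from data and would likely sum over alignments rather than take only the best one.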
APA:
Hacker, M., & Nöth, E. (2014). A Phonetic Similarity Based Noisy Channel Approach to ASR Hypothesis Re-Ranking and Error Detection. In Proceedings of ICASSP 2014 (pp. 2317-2321). Florence: Institute of Electrical and Electronics Engineers Inc.
MLA:
Hacker, Martin, and Elmar Nöth. "A Phonetic Similarity Based Noisy Channel Approach to ASR Hypothesis Re-Ranking and Error Detection." Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Florence: Institute of Electrical and Electronics Engineers Inc., 2014. 2317-2321.