Post hoc implementation of non-standard phonetic features in the context of aphasic speech analysis
DOI: https://doi.org/10.21248/jlcl.38.2025.251
Keywords: Aphasia, Speech Therapy, Automatic Speech Recognition, Thuringian-Upper Saxon dialect
Abstract
Despite recent progress, automatic speech recognition (ASR) often struggles with non-standard speech, such as speech influenced by dialectal or pathological features. (Re)training ASR models to accommodate these variations is not always possible because data are limited. This paper proposes applying knowledge about non-standard (aphasic and dialectal) phonetic features to the ASR transcription post hoc. Using speech data from German speakers with aphasia who speak the Thuringian-Upper Saxon dialect, the study evaluates the impact of these modifications on an ASR-based error analysis pipeline. The approach helps to reduce automatic error rates on recordings manually labelled as error-free. The performance of the pipeline also improves both in the general acceptance or rejection of responses and in error attribution. General acceptance/rejection accuracy reaches a mean of 83.3%, which is considered sufficient for use in a digital application for speech and language therapy support.
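The post hoc idea summarised in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the rule set (treating p/b, t/d and k/g as interchangeable), the function names, the character-level error rate, the threshold and the example words are all assumptions made here for illustration. The sketch only shows how collapsing expected phonetic variants after recognition can lower the error rate computed against the target word, without retraining the ASR model.

```python
# Hypothetical equivalence classes: each voiceless stop is treated as
# interchangeable with its voiced counterpart. This is an illustrative
# assumption, not the rule set used in the paper.
LENITION = str.maketrans({"p": "b", "t": "d", "k": "g"})


def normalize(transcript: str) -> str:
    """Lowercase the transcript and collapse the assumed variant pairs."""
    return transcript.lower().translate(LENITION)


def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance over characters."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def error_rate(target: str, recognized: str, post_hoc: bool) -> float:
    """Character error rate of the ASR output against the target word."""
    if post_hoc:
        target, recognized = normalize(target), normalize(recognized)
    else:
        target, recognized = target.lower(), recognized.lower()
    return edit_distance(target, recognized) / max(len(target), 1)


def accept(target: str, recognized: str,
           threshold: float = 0.0, post_hoc: bool = True) -> bool:
    """Accept the response if its error rate does not exceed the threshold."""
    return error_rate(target, recognized, post_hoc) <= threshold


# Made-up example: the target "Tante" is transcribed by the ASR as "Dande".
raw = error_rate("Tante", "Dande", post_hoc=False)
adjusted = error_rate("Tante", "Dande", post_hoc=True)
```

In this made-up example the raw comparison counts two substitutions, whereas the adjusted comparison counts none, so the response would be accepted. A response that differs in other ways, such as "Kanne" for "Tante", still counts as erroneous after normalization.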
License
Copyright (c) 2025 Eugenia Rykova, Elisabeth Zeuner, Susanne Voigt-Zimmermann, Mathias Walther

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.