Performance Evaluation of Automatic Speech Recognition for Impaired
Speech (Dysarthria) People Based on IDEA and CLIPS Data
Author: Shivansh Kulshrestha
Speech impairment poses a severe challenge for Automatic Speech Recognition (ASR) systems, since dysarthria renders conventional methods ineffective. This study examines whether a tailored speech analysis technique can be applied to dysarthric speakers. The technique aims to improve ASR efficacy by optimizing the spectral analysis parameters and identifying their optimal regions. The study further investigates the relationship between a speaker's vocal features and the window-shift parameters that minimize recognition error for that speaker. The dataset contains both normal and impaired speech and was drawn from two freely accessible databases: the IDEA database, which contains 30 dysarthric speakers, and the CLIPS database, which contributed 10 speakers. Recordings from both databases were first checked for quality, and only good-quality recordings were included in the experimental analysis.
The results showed that, for speakers with dysarthria, a standard ASR system performs better when the proposed speech analysis is applied. Conversely, the method performed poorly when speech was unimpaired or only minimally impaired. Finally, the findings indicated a significant correlation between a speaker's vocal features and the optimal parameters.
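The abstract does not spell out how the speaker-specific parameters are searched, so the following is only a minimal sketch of one plausible approach: an exhaustive grid search over analysis window length and frame shift that keeps the pair with the lowest mean word error rate for one speaker. The function names, the `recognize` callable, and the interpretation of "window-shift parameters" as a (window length, frame shift) pair in milliseconds are all assumptions made for illustration, not details taken from the study.

```python
from itertools import product
from typing import Callable, Optional, Sequence, Tuple


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must not be empty")
    # prev[j] holds the edit distance between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def best_window_shift(
    utterances: Sequence[Tuple[object, str]],
    recognize: Callable[[object, float, float], str],
    window_ms_grid: Sequence[float],
    shift_ms_grid: Sequence[float],
) -> Optional[Tuple[float, float, float]]:
    """Return (window_ms, shift_ms, mean_wer) with the lowest mean WER.

    `utterances` is a list of (audio, reference_transcript) pairs for ONE
    speaker. `recognize` is a hypothetical stand-in for an ASR system whose
    front end accepts a window length and frame shift in milliseconds.
    Returns None if no grid point has shift <= window.
    """
    if not utterances:
        raise ValueError("at least one utterance is required")
    best = None
    for window_ms, shift_ms in product(window_ms_grid, shift_ms_grid):
        # Assumption: a shift longer than the window would skip samples.
        if shift_ms > window_ms:
            continue
        errors = [
            word_error_rate(ref, recognize(audio, window_ms, shift_ms))
            for audio, ref in utterances
        ]
        mean_wer = sum(errors) / len(errors)
        if best is None or mean_wer < best[2]:
            best = (window_ms, shift_ms, mean_wer)
    return best
```

Running the search once per speaker would yield one optimal (window, shift) pair per speaker, which could then be compared against that speaker's vocal features to look for the kind of correlation the study reports.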