Deep Neural Networks for Voice Quality Assessment based on the GRBAS Scale | |
Simin Xie; Nan Yan; Ping Yu; Manwa L. Ng; Lan Wang; Zhuanzhuan Ji | |
2016 | |
Conference name | Interspeech 2016 |
Conference location | San Francisco, USA |
Abstract | In the field of speech therapy, perceptual evaluation by expert listeners is widely used to assess the quality of pathological and normal voices. This approach is inherently subjective: it is prone to listener bias, and high inter- and intra-listener variability can be found. As such, research on the automatic assessment of pathological voices using a combination of subjective and objective analyses has emerged. The present study aimed to develop a complementary automatic assessment system for voice quality based on the well-known GRBAS scale, using an array of multidimensional acoustic measures as input to Deep Neural Networks. A 44-dimensional set of measures, including Mel Frequency Cepstral Coefficients, Smoothed Cepstral Peak Prominence and Long-Term Average Spectrum, was adopted. In addition, the state-of-the-art automatic assessment system based on Modulation Spectrum (MS) features and GMM classifiers was used as a comparison system. The classification results of the proposed method showed a moderate correlation with subjective GRBAS scores of dysphonic severity and outperformed the MS-GMM system, with a best accuracy of around 81.53%. The findings indicate that such an assessment system can be used as an appropriate evaluation tool for determining the presence and severity of voice disorders. |
Indexed by | EI |
Language | English |
Content type | Conference paper |
Source URL | [http://ir.siat.ac.cn:8080/handle/172644/10040] |
Collection | Shenzhen Institutes of Advanced Technology_Institute of Advanced Integration Technology (集成所) |
Author affiliation | 2016 |
Recommended citation (GB/T 7714) | Simin Xie, Nan Yan, Ping Yu, et al. Deep Neural Networks for Voice Quality Assessment based on the GRBAS Scale[C]. In: Interspeech 2016. San Francisco, USA. |