An End-to-End Text-Independent Speaker Identification System on Short Utterances
Ji RF(吉瑞芳)1,2; Cai XY(蔡新元)1; Xu B(徐波)1
2018-09
Conference date: 2018-9-2 to 2018-9-6
Conference location: Hyderabad, India
Abstract

In the field of speaker recognition, text-independent speaker identification on short utterances remains a challenging task, since it is difficult to extract a robust and discriminative speaker feature under short-duration conditions. This paper explores an end-to-end speaker identification system that maps utterances to a speaker identity subspace where the similarity of speakers can be measured by Euclidean distance. Specifically, we apply GRU architectures to extract utterance-level features. We then assume that a speaker's different utterances can be viewed as transformations of a single object in an ideal speaker identity subspace. Based on this assumption, the ResCNN architecture is used to model the transformation, and the whole system is jointly optimized with a speaker identity subspace loss. Experimental results demonstrate the effectiveness of the proposed system and its superiority over previous methods. For example, the GRU-learned feature reduces the equal error rate by a relative 27.53%, and the speaker identity subspace loss brings a further 7.22% relative reduction compared to softmax loss.
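
As a rough illustration of the two ideas in the abstract that are easy to state in code, the sketch below shows nearest-speaker identification by Euclidean distance in an embedding space, and how a relative reduction in equal error rate is computed. It is not the paper's system: the embeddings would come from the trained GRU/ResCNN network, which is not shown, and the function names, speaker labels, and toy vectors are all assumptions made up for this example.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def identify_speaker(test_embedding, enrolled):
    """Return the enrolled speaker whose embedding is closest to the test one.

    `enrolled` maps a (hypothetical) speaker label to one embedding vector.
    """
    return min(
        enrolled,
        key=lambda spk: euclidean_distance(test_embedding, enrolled[spk]),
    )


def relative_reduction(baseline_eer, new_eer):
    """Relative EER reduction in percent: 100 * (baseline - new) / baseline."""
    return 100.0 * (baseline_eer - new_eer) / baseline_eer


# Toy 3-dimensional embeddings, invented for illustration (not from the paper).
enrolled = {
    "speaker_a": [0.9, 0.1, 0.0],
    "speaker_b": [0.0, 0.8, 0.6],
}
test_embedding = [0.8, 0.2, 0.1]

print(identify_speaker(test_embedding, enrolled))
```

With these toy vectors the test embedding lies much closer to `speaker_a` than to `speaker_b`, so that label is returned. The `relative_reduction` helper only spells out what "relative" means for the percentages quoted in the abstract; the absolute error rates behind them are not given in this record.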

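Content type: Conference paper
Source URL: [http://ir.ia.ac.cn/handle/173211/23545]
Collection: Research Center for Digital Content Technology and Services_Auditory Models and Cognitive Computing
Author affiliations: 1. Institute of Automation, Chinese Academy of Sciences
2. University of Chinese Academy of Sciences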
Recommended citation
GB/T 7714
Ji RF, Cai XY, Xu B. An End-to-End Text-Independent Speaker Identification System on Short Utterances[C]. In: . Hyderabad, India. 2018-9-2 to 2018-9-6.