Research on Microphone Array Speech Enhancement for Human-machine Interaction

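Title	Research on Microphone Array Speech Enhancement for Human-machine Interaction
Author	何成林 (He Chenglin)
Degree	Doctoral
Defense date	2005
Degree-granting institution	Institute of Acoustics, Chinese Academy of Sciences
Place of conferral	Institute of Acoustics, Chinese Academy of Sciences
Keywords	human-machine interaction; microphone array; speech enhancement; speech detection; array signal processing; integrated Wiener filtering; post-filtering
Original title (Chinese)	人机交互应用中麦克风阵列语音增强的研究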
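Abstract (translated from the Chinese)	In practical human-machine speech interaction, the signal processed by the computer contains not only the target speech but often also noise, interfering speech, or both, which sharply reduces the usability of speech recognition. This thesis studies front-end processing for speech recognition tailored to practical human-machine speech interaction scenarios, enhancing the target speech relative to noise and interfering speech so as to improve the usability of speech recognition in those scenarios. The innovative contributions are as follows. 1. The noise-reduction performance of the basic microphone array speech enhancement techniques is analyzed systematically, including the classical delay-and-sum beamformer, adaptive beamformers, and post-filtering. Several recent microphone array speech enhancement algorithms are also analyzed, such as the near-field superdirective beamformer, the generalized singular value decomposition structure, and the transfer-function generalized sidelobe canceller, and the characteristics of these algorithms and structures and their limitations in practical applications are summarized. 2. Exploiting the spatial distribution of target and interfering sources in practical human-machine speech interaction, a microphone array speech detection method that combines Wiener post-filtering with spatial filtering is proposed; it handles speech detection well at low signal-to-noise ratio (SNR) and in the presence of interfering speech. When the target and interfering sources are fixed, or move to a limited extent relative to each other, for array signals with an SNR of -5 dB and an interference-to-noise ratio (INR) of -5 dB, the detection accuracy is 87.3% for target speech and 82.2% for interfering speech; when interfering speech and target speech are present simultaneously (SNR = 0 dB, SIR = -5 dB), the detection accuracy is 89.9%. 3. A robust microphone array speech enhancement structure with integrated Wiener filtering (RGSC-IW) is proposed, in which an effective adaptive mode controller (AMC) controls the adaptation of the generalized sidelobe canceller (GSC). Experimental results show that when the target and interfering sources are fixed or move to a limited extent relative to each other, RGSC-IW achieves noise and interference cancellation comparable to that of the manually adapted generalized sidelobe canceller with Wiener post-filter (GSC-PW), while the speech enhanced by RGSC-IW is less distorted.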
Abstract (English)	In practical applications of human-machine speech interaction, the signal received by the computer comprises not only target speech but also noise and interfering speech, which severely degrades the usability of speech recognition. Based on practical scenarios of human-machine speech interaction, this thesis focuses on front-end processing for speech recognition, i.e. enhancing the target speech and suppressing the interfering speech and the noise, in order to improve the performance of the speech recognizer. The innovative contributions of this thesis fall into three parts: 1. Some basic techniques for microphone array speech enhancement, such as the Delay-and-Sum Beamformer, the Adaptive Beamformer and the Post-filter, are explained, and their noise reduction properties are analyzed. The mechanisms of some of the most advanced technologies for microphone array speech enhancement, such as the Near-field Superdirectivity Beamformer, Generalized Singular Value Decomposition and the Transfer Function Generalized Sidelobe Canceller, are also presented. The properties of these algorithms and the acoustic environments appropriate for using them are also analyzed. 2. Based on the position characteristics of the target and interfering speakers, a new speech detector based on a Wiener post-filter and a spatial filter is proposed, which is suitable for speech detection when the signal-to-noise ratio is low and interfering speech exists. When the target and interfering speakers are fixed or moving within a small area, the detector detects the target speech with an accuracy of 87.3% (SNR=-5dB) and the interfering speech with an accuracy of 82.2% (INR=-5dB), and the accuracy is 89.9% when the target and interfering speech occur at the same time (SNR=0dB, SIR=-5dB). 3. A new structure (RGSC-IW) for microphone array speech enhancement is proposed, in which an efficient adaptive mode controller (AMC) is constructed to control the adaptation of the GSC. When the target and interfering speakers are fixed or moving within a small area, practical tests show that RGSC-IW achieves nearly the same noise and interference reduction as the combination of GSC with a Wiener post-filter (GSC-PW) whose adaptation is controlled manually, and the speech enhanced by the former is less distorted than that enhanced by the latter.
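Language	Chinese
Release date	2011-05-07
Pages	108
Content type	Thesis
Source URL	[http://159.226.59.140/handle/311008/1054]
Collection	Institute of Acoustics_IOA Doctoral and Master's Theses_Doctoral and Master's Theses 1981-2009
Recommended citation (GB/T 7714)	何成林 (He Chenglin). Research on Microphone Array Speech Enhancement for Human-machine Interaction [D]. Institute of Acoustics, Chinese Academy of Sciences. Institute of Acoustics, Chinese Academy of Sciences. 2005.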
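
Illustrative note: the abstract in this record names the classical delay-and-sum beamformer as one of the baseline techniques the thesis analyzes. The following is a minimal Python sketch of that general technique only, not code from the thesis. It assumes integer sample delays that are already known; the function names (delay_and_sum, simulate_array, power) and the test signal are hypothetical and chosen purely for illustration.

```python
import math
import random


def delay_and_sum(channels, delays):
    """Align each channel by its integer sample delay, then average.

    channels: list of equal-length sample lists, one per microphone.
    delays:   delays[m] is how many samples later the target reaches
              microphone m (assumed known here; estimating them is a
              separate problem not covered by this sketch).
    """
    n = len(channels[0])
    out = []
    for t in range(n):
        acc = 0.0
        for ch, d in zip(channels, delays):
            idx = t + d  # read ahead to undo the arrival delay
            if 0 <= idx < n:
                acc += ch[idx]
        out.append(acc / len(channels))
    return out


def simulate_array(target, delays, noise_std, seed=0):
    """Hypothetical test data: delayed copies of the target plus
    independent Gaussian noise on every microphone."""
    rng = random.Random(seed)
    channels = []
    for d in delays:
        ch = []
        for t in range(len(target)):
            s = target[t - d] if t >= d else 0.0
            ch.append(s + rng.gauss(0.0, noise_std))
        channels.append(ch)
    return channels


def power(x):
    """Mean squared value of a sample list."""
    return sum(v * v for v in x) / len(x)


if __name__ == "__main__":
    target = [math.sin(2 * math.pi * 0.01 * t) for t in range(2000)]
    delays = [0, 1, 2, 3]
    noisy = simulate_array(target, delays, noise_std=1.0)
    enhanced = delay_and_sum(noisy, delays)
    err_in = power([x - s for x, s in zip(noisy[0], target)])
    err_out = power([y - s for y, s in zip(enhanced, target)])
    print("residual noise power, one microphone:", round(err_in, 3))
    print("residual noise power, beamformer out:", round(err_out, 3))
```

Because the aligned target adds coherently while independent noise does not, averaging M microphones reduces uncorrelated noise power by roughly a factor of M. The adaptive and post-filtering structures named in the abstract aim to improve on this baseline when the interference is directional or correlated across microphones.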
