Research on Subspace Speech Enhancement Algorithms

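Title	Research on Subspace Speech Enhancement Algorithms
Author	Zhong Weibao (钟维保)
Degree type	Doctoral
Defense date	2005
Degree-granting institution	Institute of Acoustics, Chinese Academy of Sciences
Place of conferral	Institute of Acoustics, Chinese Academy of Sciences
Keywords	speech enhancement; subspace; KL transform; eigenvalue/singular value decomposition; projection approximation subspace tracking; rank-deficient noise
Alternative title	Research on Signal Subspace Speech Enhancement Algorithm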
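Chinese abstract	Speech enhancement is an important technique for improving the quality of speech communication systems in noisy environments. Many speech enhancement methods have been proposed, the most successful of which are spectral subtraction and Wiener filtering. These methods, however, inevitably introduce artificial noise and distortion. In recent years, subspace-based speech enhancement has attracted the attention of many researchers because it can reduce both signal distortion and the introduction of artificial noise. The subspace technique treats the noisy speech signal as part of a vector space and partitions this space into two mutually orthogonal subspaces: the signal subspace and the noise subspace. Removing the signal components that lie in the noise subspace improves the quality of the noisy speech, and a high-quality speech signal is then estimated from the signal subspace. Ephraim and Van Trees proposed an effective subspace speech enhancement system that uses eigenvalue decomposition (EVD) and the KL transform to partition the signal space, and developed effective time-domain and frequency-domain linear estimators for noisy speech in white noise. Later researchers extended the method to speech corrupted by colored noise. Rezayee and Gazor proposed a suboptimal time-domain estimator based on an approximate diagonalization of the noise power spectrum; Hu and Loizou proposed joint diagonalization to handle colored noise; Lev-Ari and Ephraim used prewhitening to extend the earlier algorithm to colored noise. Building on the adaptive KLT speech enhancement algorithm of Rezayee and Gazor, and extending the projection approximation subspace tracking algorithm proposed by Yang, this thesis proposes a speech enhancement algorithm based on joint projection approximation subspace tracking. The method recursively updates the eigenvectors and eigenvalues that simultaneously diagonalize the speech and noise covariance matrices, in order to obtain an optimal estimate of the speech signal. The subspace speech enhancement algorithms mentioned above all assume that the noise covariance matrix is positive definite. In practice this condition does not always hold, for example under narrowband noise interference. This thesis proposes a subspace enhancement algorithm that extends to rank-deficient noise. The vector space is partitioned into a signal-plus-noise subspace and a noise-free subspace according to the noise covariance matrix rather than the noisy-signal covariance matrix, so the two mutually orthogonal subspaces obtained differ from the two described earlier. The method converts the original narrowband-noise problem into a wideband-noise problem within the signal-plus-noise subspace, uses prewhitening to extract the speech signal in that subspace, and adds the useful signal in the noise-free subspace to reconstruct the original speech. Experimental comparisons and subjective listening tests show that the proposed methods effectively suppress background noise and improve speech quality.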
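The white-noise case described in the abstract can be illustrated with a small numerical sketch. The record itself gives no formulas, so the gain rule below is the commonly cited time-domain constrained form, H = U diag(g) Uᵀ with g = λx / (λx + μσ²), and is an assumption about what the thesis builds on. The function name `subspace_filter`, the multiplier `mu`, and the toy covariance matrices are hypothetical and chosen only for illustration.

```python
import numpy as np

def subspace_filter(R_y, noise_var, mu=1.0):
    """Sketch of a linear estimator H = U diag(g) U^T for white noise.

    R_y       : covariance matrix of the noisy speech vector (K x K)
    noise_var : white-noise variance (assumed known and > 0)
    mu        : Lagrange-multiplier-style trade-off parameter (assumed)
    """
    # Symmetric EVD of the noisy covariance; the columns of U are the KLT basis.
    eigvals, U = np.linalg.eigh(R_y)
    # Estimated clean-speech eigenvalues: subtract the noise floor, clip at zero.
    lam_x = np.maximum(eigvals - noise_var, 0.0)
    # Gain is zero in the noise subspace and Wiener-like in the signal subspace.
    gains = lam_x / (lam_x + mu * noise_var)
    return (U * gains) @ U.T

# Toy data: a rank-2 "speech" covariance plus white noise (illustrative only).
rng = np.random.default_rng(0)
K, rank, noise_var = 8, 2, 0.5
B = rng.standard_normal((K, rank))
R_x = B @ B.T
R_y = R_x + noise_var * np.eye(K)

H = subspace_filter(R_y, noise_var)
```

With `mu = 1` this sketch reduces to the Wiener filter `R_x @ inv(R_y)`; larger values of `mu` suppress more residual noise at the cost of more signal distortion.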
English abstract	Speech enhancement is an important technique for improving the performance of speech communication systems in noisy environments. Over the last ten years, the signal subspace (SS) approach to speech enhancement has attracted a great deal of research effort. The SS method decomposes the noisy speech vector space into two orthogonal subspaces, called the signal subspace and the noise subspace. To improve speech quality, the components in the noise subspace are discarded. If the decomposition is performed correctly, the amount of noise in the speech signal is reduced without introducing distortion, and the speech signal can then be processed further for better quality. Ephraim and Van Trees (EVT) first introduced an efficient SS speech enhancement system, in which the KLT and EVD were employed to decompose the noisy speech signal into two uncorrelated components. Optimal linear estimators were then developed based on two perceptually meaningful estimation criteria. Extensions of this technique to colored noise were proposed subsequently. Rezayee and Gazor developed a time-domain constrained estimator that uses a diagonal matrix to approximate the colored-noise power spectrum, which results in a suboptimal estimator. Hu and Loizou presented a time-domain constrained estimator based on the joint diagonalization of the covariance matrices of the clean signal and the noise process. Lev-Ari and Ephraim extended the EVT subspace approach to colored-noise processes in the time and spectral domains by whitening the input noise. In this thesis, we propose a speech enhancement algorithm based on the United Projection Approximation Subspace Tracking (UPAST) algorithm, which simultaneously diagonalizes the speech and noise covariance matrices. This method builds on the adaptive KLT algorithm proposed by Rezayee and Gazor and the Projection Approximation Subspace Tracking (PAST) approach proposed by Yang. Results show that the proposed UPAST algorithm can efficiently attain the optimal estimate of speech corrupted by colored noise. Another contribution of the thesis is a speech enhancement algorithm for narrowband noise. Existing subspace methods assume that the noise covariance matrix is positive definite so that the prewhitening step can be performed. However, there are applications where this requirement is not met. For example, in the case of narrowband noise, the noise covariance matrix is rank deficient. Based on the eigenvalue decomposition of the rank-deficient noise covariance matrix, we show how to formulate the enhancement algorithm by decomposing the vector space into a signal-plus-noise subspace and a noise-free subspace. This partition differs from that of previous subspace methods. We then implement the noise reduction algorithm using the whitening approach exclusively in the signal-plus-noise subspace. The clean speech is estimated by combining the filtered components in the signal-plus-noise subspace with the components in the noise-free subspace. In this way the noise is effectively reduced and the speech quality is improved. Examples are given to illustrate the merits of the proposed subspace method.
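The rank-deficient case outlined in the abstract can be sketched as follows. This is one possible reading of the description, not the thesis's actual algorithm: the noise covariance is eigendecomposed, the eigenvectors with non-negligible eigenvalues span the signal-plus-noise subspace, prewhitening and a Wiener-like gain are applied only there, and the remaining noise-free components pass through unchanged. The function name `rank_deficient_filter`, the threshold `tol`, the multiplier `mu`, and the toy sinusoidal noise model are all assumptions, and the sketch ignores any correlation between speech components in the two subspaces.

```python
import numpy as np

def rank_deficient_filter(R_y, R_n, mu=1.0, tol=1e-10):
    """Sketch of a subspace estimator for a rank-deficient noise covariance."""
    # EVD of the noise covariance; its null space is the noise-free subspace.
    lam_n, V = np.linalg.eigh(R_n)
    noisy = lam_n > tol * lam_n.max()
    V1, V2 = V[:, noisy], V[:, ~noisy]   # signal-plus-noise / noise-free bases
    lam1 = lam_n[noisy]

    # Prewhitening restricted to the signal-plus-noise subspace:
    # W.T @ R_n @ W is the identity, and W_inv maps whitened coordinates back.
    W = V1 / np.sqrt(lam1)
    W_inv = V1 * np.sqrt(lam1)

    # In whitened coordinates the noise is white with unit variance.
    R_yw = W.T @ R_y @ W
    lam_yw, U = np.linalg.eigh(R_yw)
    lam_xw = np.maximum(lam_yw - 1.0, 0.0)
    gains = lam_xw / (lam_xw + mu)

    H1 = W_inv @ (U * gains) @ U.T @ W.T   # filtered signal-plus-noise part
    H2 = V2 @ V2.T                         # noise-free part kept as is
    return H1 + H2

# Toy narrowband noise: one sinusoid gives a rank-2 covariance (illustrative).
K = 8
n = np.arange(K)
omega = 0.3 * np.pi
c, s = np.cos(omega * n), np.sin(omega * n)
R_n = 2.0 * (np.outer(c, c) + np.outer(s, s))

rng = np.random.default_rng(1)
B = rng.standard_normal((K, 3))
R_x = B @ B.T
R_y = R_x + R_n

H = rank_deficient_filter(R_y, R_n)
```

Applying `H` to a noisy frame `y` gives the estimate `H @ y`; any component of `y` orthogonal to the noise is reproduced exactly, which is the property that distinguishes this partition from one based on the noisy-signal covariance.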
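Language	Chinese
Release date	2011-05-07
Pages	62
Content type	Dissertation
Source URL	[http://159.226.59.140/handle/311008/980]
Collection	Institute of Acoustics_Theses and Dissertations of the Institute of Acoustics_Theses and Dissertations 1981-2009
Recommended citation (GB/T 7714)	Zhong Weibao. Research on Subspace Speech Enhancement Algorithms [D]. Institute of Acoustics, Chinese Academy of Sciences. 2005.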
