Different methods of extracting speech features from an auditory model were systematically investigated in terms of their robustness to different types of noise. The methods computed either the average firing rate within frequency channels (spectral features) or inter-spike intervals (timing features) from the simulated auditory nerve response. When used as the front-end for an automatic speech recognizer, timing features outperformed spectral features in Gaussian noise. However, this advantage was lost in babble, because timing features extracted the spectro-temporal structure of babble noise, which is similar to that of the target speaker. This suggests that the optimal feature extraction method depends on the background noise.
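To make the two feature families concrete, the following is a minimal sketch and not the study's implementation. It assumes the simulated auditory nerve response is available as a list of spike times (in seconds) per frequency channel. The function names `rate_features` and `isi_features`, the bin edges, and the toy spike trains are all hypothetical, chosen only for illustration.

```python
from bisect import bisect_right
from typing import List


def rate_features(spikes: List[List[float]], duration_s: float) -> List[float]:
    """Spectral-style feature: average firing rate (spikes/s) per channel."""
    return [len(channel) / duration_s for channel in spikes]


def isi_features(spikes: List[List[float]], bin_edges_s: List[float]) -> List[List[int]]:
    """Timing-style feature: histogram of inter-spike intervals per channel.

    Bins are half-open, [edge_i, edge_{i+1}); intervals outside the edges
    are ignored. The binning scheme is an assumption of this sketch.
    """
    n_bins = len(bin_edges_s) - 1
    features = []
    for channel in spikes:
        times = sorted(channel)
        counts = [0] * n_bins
        # Intervals between consecutive spikes in the same channel.
        for earlier, later in zip(times, times[1:]):
            idx = bisect_right(bin_edges_s, later - earlier) - 1
            if 0 <= idx < n_bins:
                counts[idx] += 1
        features.append(counts)
    return features


# Toy input: two channels observed for 0.1 s (hypothetical values).
toy_spikes = [[0.010, 0.020, 0.030, 0.040], [0.005, 0.030]]
toy_edges = [0.0, 0.005, 0.015, 0.05]

rates = rate_features(toy_spikes, duration_s=0.1)
isis = isi_features(toy_spikes, toy_edges)
```

In this sketch the rate feature keeps only how many spikes each channel produced, whereas the interval histogram keeps how they were spaced in time, which is the distinction the comparison between spectral and timing features rests on.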

