Harvest: A high-performance fundamental frequency estimator from speech signals

研究成果: Conference article査読

24 被引用数 (Scopus)

抄録

A fundamental frequency (F0) estimator named Harvest is described. The unique points of Harvest are that it can obtain a reliable F0 contour and reduce the error that the voiced section is wrongly identified as the unvoiced section. It consists of two steps: estimation of F0 candidates and generation of a reliable F0 contour on the basis of these candidates. In the first step, the algorithm uses fundamental component extraction by many band-pass filters with different center frequencies and obtains the basic F0 candidates from filtered signals. After that, basic F0 candidates are refined and scored by using the instantaneous frequency, and then several F0 candidates in each frame are estimated. Since the frame-by-frame processing based on the fundamental component extraction is not robust against temporally local noise, a connection algorithm using neighboring F0s is used in the second step. The connection takes advantage of the fact that the F0 contour does not precipitously change in a short interval. We carried out an evaluation using two speech databases with electroglottograph (EGG) signals to compare Harvest with several state-of-the-art algorithms. Results showed that Harvest achieved the best performance of all algorithms.

本文言語English
ページ(範囲)2321-2325
ページ数5
ジャーナルProceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
2017-August
DOI
出版ステータスPublished - 2017
イベント18th Annual Conference of the International Speech Communication Association, INTERSPEECH 2017 - Stockholm, Sweden
継続期間: 20 8月 201724 8月 2017

フィンガープリント

「Harvest: A high-performance fundamental frequency estimator from speech signals」の研究トピックを掘り下げます。これらがまとまってユニークなフィンガープリントを構成します。

引用スタイル