Speaker identification framework by peripheral and central auditory models

Masanori Morise, Kenji Ozawa

Research output: Article, peer-reviewed

Abstract

A speaker identification framework based on human auditory models was proposed. A preliminary evaluation on a very small database was carried out to determine suitable deep neural network (DNN) parameters and to assess the effectiveness of the proposed framework. The database contains four speakers and consists of isolated vowels recorded with a microphone in a recording studio. Because combining several frames would improve performance, frame-by-frame evaluation with isolated vowels is one of the strictest conditions. The number of hidden layers mattered less than the number of units in each hidden layer: two hidden layers were sufficient, and more than three did not improve performance. The results suggest that the type of domain is not important, provided that DNNs are used as the classifier.
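
The abstract notes that combining several frames would improve performance over frame-by-frame decisions, but it does not say how frames would be combined. The following is a minimal sketch of one common option, a majority vote over frame-level labels. The function name `majority_vote` and the speaker labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def majority_vote(frame_predictions):
    """Return the speaker label that was predicted for the most frames.

    Assumption: each element is the label a frame-level classifier
    assigned to one frame. The paper does not specify this scheme.
    """
    if not frame_predictions:
        raise ValueError("need at least one frame-level prediction")
    # most_common(1) yields [(label, count)]; on a tie, the label
    # encountered first in the input wins (Python 3.7+).
    return Counter(frame_predictions).most_common(1)[0][0]


# Hypothetical frame-level outputs for a single vowel recording.
frames = ["spk_a", "spk_b", "spk_a", "spk_a", "spk_c"]
utterance_label = majority_vote(frames)
```

Under this scheme a few misclassified frames do not change the final decision, which is why scoring every frame on its own is the harder condition.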

Original language: English
Pages (from-to): 340-343
Number of pages: 4
Journal: Acoustical Science and Technology
Volume: 36
Issue number: 4
DOI
Publication status: Published - 2015
