Speaker identification framework by peripheral and central auditory models

Masanori Morise, Kenji Ozawa

Research output: Contribution to journal › Article › peer-review

Abstract

A speaker identification framework based on human auditory models is proposed. A preliminary evaluation on a very small database was carried out to determine suitable deep neural network (DNN) parameters and to assess the effectiveness of the proposed framework. The database contains four speakers and consists of isolated vowels recorded with a microphone in a recording studio. Because combining several frames would improve performance, frame-by-frame evaluation with isolated vowels is one of the strictest conditions. The number of hidden layers mattered less than the number of units in each hidden layer: two hidden layers were enough, and more than three did not improve performance. The results suggest that the type of domain is not important, provided that DNNs are used as the classifier.
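
The abstract notes that unifying several frames would improve on frame-by-frame decisions but does not say how the frames would be combined. The following is a minimal sketch of one common option, summing log-posteriors over frames. The function names, the combination rule, and the posterior values are all assumptions made for illustration; none of them are taken from the paper.

```python
import math


def frame_decisions(posteriors):
    """Pick the most probable speaker independently for every frame."""
    return [max(range(len(p)), key=p.__getitem__) for p in posteriors]


def utterance_decision(posteriors, eps=1e-12):
    """Sum log-posteriors over all frames and pick the best speaker overall.

    This combination rule is an assumption; the paper does not specify one.
    """
    n_speakers = len(posteriors[0])
    scores = [0.0] * n_speakers
    for p in posteriors:
        for k in range(n_speakers):
            # eps guards against log(0) for a zero posterior.
            scores[k] += math.log(p[k] + eps)
    return max(range(n_speakers), key=scores.__getitem__)


# Hypothetical per-frame classifier outputs for four speakers
# (made-up numbers, not results from the paper).
posteriors = [
    [0.40, 0.30, 0.20, 0.10],
    [0.30, 0.45, 0.15, 0.10],
    [0.50, 0.20, 0.20, 0.10],
    [0.35, 0.40, 0.15, 0.10],
    [0.55, 0.15, 0.20, 0.10],
]

per_frame = frame_decisions(posteriors)
overall = utterance_decision(posteriors)
print("per-frame decisions:", per_frame)
print("combined decision:", overall)
```

In this toy input the per-frame decisions alternate between two speakers, while the combined score settles on a single speaker, which illustrates why single-frame evaluation is the harder condition.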

Original language: English
Pages (from-to): 340-343
Number of pages: 4
Journal: Acoustical Science and Technology
Volume: 36
Issue number: 4
Publication status: Published - 2015

Keywords

  • Auditory model
  • Deep neural networks
  • Speaker identification
  • Speech analysis
