| Field | Value |
| --- | --- |
| Item type | Documents (1) |
| Release date | 2021-02-08 |
| Access rights | open access |
| Resource type identifier | http://purl.org/coar/resource_type/c_6501 |
| Resource type | journal article |
| Publisher's version DOI (identifier type: DOI; language: ja) | https://doi.org/10.1177/1729881417719836 |
| Related name | 10.1177/1729881417719836 |
| Publication type | VoR |
| Publication type resource | http://purl.org/coar/version/c_970fb48d4fbd8a85 |
| Title (en) | A combined cepstral distance method for emotional speech recognition |
| Authors | Quan, Changqin; Zhang, Bin; Sun, Xiao; 任, 福継 |
| Abstract (en) | Affective computing is not only a direction of reform in artificial intelligence but also an exemplification of advanced intelligent machines. Emotion is the biggest difference between humans and machines: if a machine behaves with emotion, it will be accepted by more people. Voice is the most natural manner of daily communication and the one most easily understood and accepted, so the recognition of emotional voice is an important field of artificial intelligence. In emotion recognition, however, certain pairs of emotions are particularly vulnerable to confusion. This article presents a combined cepstral distance method for two-group multi-class emotion classification in emotional speech recognition. Cepstral distance combined with speech energy is widely used for endpoint detection of speech signals in speech recognition. In this work, cepstral distance is used to measure the similarity between frames in emotional signals and in neutral signals. These features are input to a directed acyclic graph support vector machine classifier. Finally, a two-group classification strategy is adopted to resolve confusion in multi-emotion recognition. In the experiments, a Chinese Mandarin emotion database is used, and a large training set (1134 + 378 utterances) ensures powerful modelling capability for predicting emotion. The experimental results show that cepstral distance increases the recognition rate of the emotion "sad" and balances the recognition results while eliminating overfitting. On the German Berlin emotional speech database, the recognition rate between "sad" and "boring", which are very difficult to distinguish, reaches 95.45%. |
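The abstract's core feature, the cepstral distance between speech frames, can be sketched as below. This is a minimal illustration, not the article's implementation: the frame length, sampling rate, cepstral order, and the synthetic "neutral" and "emotional" frames are all assumptions made for the example.

```python
import numpy as np

def real_cepstrum(frame, n_coeffs=12):
    """Real cepstrum of one frame: inverse FFT of the log magnitude spectrum."""
    spectrum = np.abs(np.fft.rfft(frame))
    log_spec = np.log(spectrum + 1e-10)  # small epsilon guards against log(0)
    cepstrum = np.fft.irfft(log_spec)
    return cepstrum[:n_coeffs]           # keep the low-quefrency coefficients

def cepstral_distance(frame_a, frame_b, n_coeffs=12):
    """Euclidean distance between the cepstral vectors of two frames."""
    ca = real_cepstrum(frame_a, n_coeffs)
    cb = real_cepstrum(frame_b, n_coeffs)
    return float(np.linalg.norm(ca - cb))

# Toy 256-sample frames at an assumed 8 kHz sampling rate.
rng = np.random.default_rng(0)
t = np.arange(256) / 8000.0
neutral = np.sin(2 * np.pi * 200 * t)                                 # stand-in "neutral" frame
emotional = np.sin(2 * np.pi * 350 * t) + 0.1 * rng.standard_normal(256)

d_self = cepstral_distance(neutral, neutral)     # identical frames -> distance 0
d_cross = cepstral_distance(neutral, emotional)  # differing frames -> distance > 0
print(d_self, d_cross)
```

In the article's pipeline, per-frame distances of this kind (emotional frames compared against neutral frames) would serve as input features to the classifier; the directed-acyclic-graph SVM and the two-group strategy are not sketched here.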
| Keywords (en; subject scheme: Other) | Cepstral distance; emotional speech recognition; two-group classification; principal component analysis |
| Bibliographic information (en) | International Journal of Advanced Robotic Systems, vol. 14, no. 4, published 2017-07-10 |
| Source identifier (ISSN) | 17298814 |
| Publisher (en) | SAGE Publications |
| Rights (en) | This article is distributed under the terms of the Creative Commons Attribution 4.0 License (http://www.creativecommons.org/licenses/by/4.0/), which permits any use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access pages (https://us.sagepub.com/en-us/nam/open-access-at-sage). |
| EID (identifier type: URI) | 325633 |
| Language | eng |