End-to-end recognition of streaming Japanese speech using CTC and local attention
https://tokushima-u.repo.nii.ac.jp/records/2008800
Item type | Documents (1) |
---|---|
Release date | 2021-06-11 |
Access rights | |
Access rights | open access |
Resource type | |
Resource type identifier | http://purl.org/coar/resource_type/c_6501 |
Resource type | journal article |
Publisher version DOI | |
Identifier type | DOI |
Related identifier | https://doi.org/10.1017/ATSIP.2020.23 |
Language | ja |
Related name | 10.1017/ATSIP.2020.23 |
Publication type | |
Publication type | VoR |
Publication type resource | http://purl.org/coar/version/c_970fb48d4fbd8a85 |
Title | |
Title | End-to-end recognition of streaming Japanese speech using CTC and local attention |
Language | en |
Alternative title | |
Other title | E2E SPEECH RECOGNITION WITH CTC AND LOCAL ATTENTION |
Language | en |
Authors | Chen, Jiahao; 西村, 良太 (WEKO author ID: 942); 北岡, 教英 |
Abstract | |
Description type | Abstract |
Description | Many end-to-end, large vocabulary, continuous speech recognition systems are now able to achieve better speech recognition performance than conventional systems. However, most of these approaches are based on bidirectional networks and sequence-to-sequence modeling, so automatic speech recognition (ASR) systems using such techniques must wait for an entire segment of voice input before they can begin processing the data. The resulting time lag can be a serious drawback in some applications. An obvious solution to this problem is to develop a speech recognition algorithm capable of processing streaming data. Therefore, in this paper we explore the possibility of a streaming, online ASR system for Japanese using a model based on unidirectional LSTMs trained with the connectionist temporal classification (CTC) criterion, with local attention. Such an approach has not been well investigated for Japanese, as most Japanese-language ASR systems employ bidirectional networks. The best result for our proposed system during experimental evaluation was a character error rate of 9.87%. |
Language | en |
Keyword | |
Language | en |
Subject scheme | Other |
Subject | CTC |
Keyword | |
Language | en |
Subject scheme | Other |
Subject | Local attention |
Keyword | |
Language | en |
Subject scheme | Other |
Subject | Speech recognition |
Keyword | |
Language | en |
Subject scheme | Other |
Subject | Streaming recognition |
Bibliographic information | en: APSIPA Transactions on Signal and Information Processing, Vol. 9, p. e25, issue date 2020-11-23 |
Source ID | |
Source identifier type | ISSN |
Source identifier | 20487703 |
Publisher | |
Publisher | Cambridge University Press |
Language | en |
Rights | |
Language | en |
Rights | This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited. |
EID | |
Identifier | 372885 |
Identifier type | URI |
Language | |
Language | eng |
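
The abstract reports its best result as a character error rate (CER). As a minimal sketch of how such a figure is commonly computed, the Python below divides the character-level edit distance between a reference transcript and a recognizer hypothesis by the reference length. The function names `edit_distance` and `character_error_rate` are illustrative assumptions, and this record does not state the exact scoring procedure or text normalization used in the paper.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference character
                curr[j - 1] + 1,     # insertion of a hypothesis character
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def character_error_rate(ref: str, hyp: str) -> float:
    """CER = (substitutions + deletions + insertions) / reference length."""
    if not ref:
        raise ValueError("reference transcript must not be empty")
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    # Assumed toy example: two of five reference characters are substituted.
    print(character_error_rate("こんにちは", "こんばんは"))
```

Scoring at the character level avoids the need for word segmentation, which Japanese text does not mark with spaces; under this definition, 9.87% corresponds to roughly one character edit per ten reference characters.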