End-to-end recognition of streaming Japanese speech using CTC and local attention

https://tokushima-u.repo.nii.ac.jp/records/2008800
349e8068-9ba8-46d0-a0f4-4217819737c8
File atsip_9_e25.pdf (421 KB)
Item type Documents(1)
Release date 2021-06-11
Access rights
Access rights open access
Resource type
Resource type identifier http://purl.org/coar/resource_type/c_6501
Resource type journal article
Publisher version DOI
Related identifier https://doi.org/10.1017/ATSIP.2020.23
Related name 10.1017/ATSIP.2020.23
Version type
Version type VoR
Version type resource http://purl.org/coar/version/c_970fb48d4fbd8a85
Title
Title End-to-end recognition of streaming Japanese speech using CTC and local attention
Alternative title
Other title E2E SPEECH RECOGNITION WITH CTC AND LOCAL ATTENTION
Author Chen, Jiahao
en Chen, Jiahao

Author 西村, 良太
WEKO 942
ja 西村, 良太
ja-Kana ニシムラ, リョウタ
en Nishimura, Ryota

Author 北岡, 教英
WEKO 728
e-Rad 10333501
ja 北岡, 教英
ja-Kana キタオカ, ノリヒデ
en Kitaoka, Norihide
Abstract
Description Many end-to-end, large-vocabulary, continuous speech recognition systems are now able to achieve better speech recognition performance than conventional systems. Most of these approaches, however, are based on bidirectional networks and sequence-to-sequence modeling, so automatic speech recognition (ASR) systems using such techniques must wait for an entire segment of voice input before they can begin processing the data. The resulting time lag can be a serious drawback in some applications. An obvious solution to this problem is to develop a speech recognition algorithm capable of processing streaming data. In this paper we therefore explore the possibility of a streaming, online ASR system for Japanese using a model based on unidirectional LSTMs trained with the connectionist temporal classification (CTC) criterion and local attention. Such an approach has not been well investigated for Japanese, as most Japanese-language ASR systems employ bidirectional networks. The best result for our proposed system during experimental evaluation was a character error rate of 9.87%.
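The abstract names two pieces that are easy to illustrate: collapsing frame-level CTC output into a label sequence, and scoring the result by character error rate. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: the function names `ctc_greedy_collapse`, `edit_distance`, and `character_error_rate` are hypothetical, the blank index of 0 is an assumption, and the error rate is assumed to be total character edit distance divided by total reference length, which is the usual definition but is not spelled out in this record.

```python
def ctc_greedy_collapse(frame_labels, blank=0):
    """Collapse per-frame label IDs the CTC way: merge repeats, drop blanks.

    Processes frames left to right, so it can run on a stream.
    The blank index (0) is an assumption for this sketch.
    """
    output = []
    previous = None
    for label in frame_labels:
        # Emit a label only when it is not blank and differs from the
        # label of the immediately preceding frame.
        if label != blank and label != previous:
            output.append(label)
        previous = label
    return output


def edit_distance(reference, hypothesis):
    """Levenshtein distance between two sequences (strings or lists)."""
    previous_row = list(range(len(hypothesis) + 1))
    for i, ref_item in enumerate(reference, start=1):
        current_row = [i]
        for j, hyp_item in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_item == hyp_item else 1
            current_row.append(min(
                previous_row[j] + 1,                      # deletion
                current_row[j - 1] + 1,                   # insertion
                previous_row[j - 1] + substitution_cost,  # match or substitution
            ))
        previous_row = current_row
    return previous_row[-1]


def character_error_rate(references, hypotheses):
    """Total character edit distance divided by total reference length."""
    total_errors = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total_characters = sum(len(r) for r in references)
    return total_errors / total_characters


if __name__ == "__main__":
    # Frames: blank, 1, 1, blank, 1, 2, 2, blank
    print(ctc_greedy_collapse([0, 1, 1, 0, 1, 2, 2, 0]))
    # One substituted character out of four
    print(character_error_rate(["音声認識"], ["音声認式"]))
```

Because Japanese text has no whitespace word boundaries, scoring at the character level avoids depending on a word segmenter, which is one common reason to report character error rate rather than word error rate.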
Keywords
Subject CTC
Subject Local attention
Subject Speech recognition
Subject Streaming recognition
Bibliographic information en : APSIPA Transactions on Signal and Information Processing

Volume 9, p. e25, issue date 2020-11-23
Source ID
Source identifier type ISSN
Source identifier 20487703
Publisher
Publisher Cambridge University Press
Rights
Rights This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
EID
Identifier 372885
Language
Language eng
Versions

Ver.1 2024-11-22 07:44:53.654908