A joint local spatial and global temporal CNN-Transformer for dynamic facial expression recognition
https://tokushima-u.repo.nii.ac.jp/records/2012042
| Name / File | License | Action |
|---|---|---|
| Download is available from 2026/5/9. | | |
| Field | Value |
|---|---|
| Item type | Documents (1) |
| Date made public | 2024-05-21 |
| Access rights | embargoed access |
| Resource type identifier | http://purl.org/coar/resource_type/c_6501 |
| Resource type | journal article |
| Publisher version DOI (related identifier) | https://doi.org/10.1016/j.asoc.2024.111680 |
| Publication type | AM (Accepted Manuscript) |
| Publication type resource | http://purl.org/coar/version/c_ab4af688f83e57aa |
| Title | A joint local spatial and global temporal CNN-Transformer for dynamic facial expression recognition |
| Authors | Wang, Linhuang; Kang, Xin; Ding, Fei; Nakagawa, Satoshi; Ren, Fuji |
| Abstract | Unlike conventional video action recognition, Dynamic Facial Expression Recognition (DFER) tasks exhibit minimal spatial movement of objects. Addressing this distinctive attribute, we propose an innovative CNN-Transformer model, named LSGTNet, specifically tailored for DFER tasks. Our LSGTNet comprises three stages, each composed of a spatial CNN (Spa-CNN) and a temporal transformer (T-Former) in sequential order. The Spa-CNN extracts spatial features from images, yielding smaller-sized feature maps to alleviate the computational complexity for the subsequent T-Former. The T-Former integrates global temporal information from the same spatial positions across different time frames while retaining the feature map dimensions. The alternating interplay between Spa-CNN and T-Former ensures a continuous fusion of spatial and temporal information, leading our model to excel across various real-world datasets. To the best of our knowledge, this is the first method to address the DFER challenge by focusing on capturing the temporal changes in muscles within local spatial regions. Our method has achieved state-of-the-art results on multiple in-the-wild datasets and datasets under laboratory conditions. (An illustrative sketch of this stage design follows the record below.) |
| Keywords | Dynamic facial expression recognition; Affective computing; Transformer; Convolution neural network |
| Bibliographic information | Applied Soft Computing, Vol. 161, p. 111680, issued 2024-05-09 |
| Source identifier (ISSN) | 15684946 |
| Source identifier (ISSN) | 18729681 |
| Source identifier (NCID) | AA11644645 |
| Source identifier (NCID) | AA11926126 |
| Publisher | Elsevier |
| Rights | © 2024. This manuscript version is made available under the CC-BY-NC-ND 4.0 license https://creativecommons.org/licenses/by-nc-nd/4.0/ |
| EID | 408424 |
| Language | eng |
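
To make the stage design described in the abstract concrete, here is a minimal PyTorch sketch of the alternating Spa-CNN / T-Former idea: each stage downsamples the per-frame feature maps with a small CNN, then applies self-attention along the time axis independently at every spatial position, keeping the feature map size. The module names (`SpaCNN`, `TFormer`, `LSGTNetSketch`), layer widths, depths, head counts, and the 7-class output are illustrative assumptions, not the authors' published implementation.

```python
# A minimal, illustrative sketch of the stage design described in the abstract,
# NOT the authors' released implementation. Layer sizes, depths, and the
# classifier head are assumptions chosen only to make the example runnable.
import torch
import torch.nn as nn


class SpaCNN(nn.Module):
    """Spatial CNN: extracts per-frame spatial features and halves H and W,
    which reduces the token count seen by the following temporal transformer."""

    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=2, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B*T, C, H, W) -> (B*T, C', H/2, W/2)
        return self.conv(x)


class TFormer(nn.Module):
    """Temporal transformer: self-attention runs along the time axis
    independently at every spatial position, so the map size is preserved."""

    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, dim_feedforward=2 * dim, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, T, C, H, W); attend over T for each (H, W) location.
        b, t, c, h, w = x.shape
        tokens = x.permute(0, 3, 4, 1, 2).reshape(b * h * w, t, c)
        tokens = self.encoder(tokens)  # temporal attention per spatial position
        return tokens.reshape(b, h, w, t, c).permute(0, 3, 4, 1, 2)


class LSGTNetSketch(nn.Module):
    """Three alternating Spa-CNN / T-Former stages, then pooling and a classifier."""

    def __init__(self, num_classes: int = 7, dims=(32, 64, 128)):
        super().__init__()
        chans = (3,) + tuple(dims)
        self.stages = nn.ModuleList(
            [nn.ModuleList([SpaCNN(chans[i], chans[i + 1]), TFormer(chans[i + 1])])
             for i in range(3)]
        )
        self.head = nn.Linear(dims[-1], num_classes)

    def forward(self, clip: torch.Tensor) -> torch.Tensor:
        # clip: (B, T, 3, H, W) video of face frames.
        b, t = clip.shape[:2]
        x = clip
        for spa_cnn, t_former in self.stages:
            frames = x.flatten(0, 1)             # (B*T, C, H, W)
            frames = spa_cnn(frames)             # spatial features, smaller maps
            x = frames.view(b, t, *frames.shape[1:])
            x = t_former(x)                      # global temporal fusion
        return self.head(x.mean(dim=(1, 3, 4)))  # pool over time and space


if __name__ == "__main__":
    logits = LSGTNetSketch()(torch.randn(2, 8, 3, 112, 112))
    print(logits.shape)  # torch.Size([2, 7])
```

The key point the sketch tries to mirror is that the transformer's sequence axis is time, not space: tokens at the same (H, W) location across frames attend to each other, while the CNN alone handles local spatial structure.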