[1] ZHONG Qiubo, ZHENG Caiming, PIAO Songhao. Research on skeleton-based action recognition with spatiotemporal fusion and human–robot interaction[J]. CAAI Transactions on Intelligent Systems, 2020, 15(3): 601-608. [doi:10.11992/tis.202006029]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume: 15
Issue: 2020(3)
Pages: 601-608
Column: Artificial Intelligence Deans Forum
Publication date: 2020-05-05
- Title:
Research on skeleton-based action recognition with spatiotemporal fusion and human–robot interaction
- Author(s):
ZHONG Qiubo1,2; ZHENG Caiming1; PIAO Songhao3
1. Robotics Institute, Ningbo University of Technology, Ningbo 315211, China;
2. State Key Laboratory of Robotics and System, Harbin Institute of Technology, Harbin 150001, China;
3. School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
- Keywords:
action recognition; temporal and spatial relationships; posture motion; spatiotemporal fusion; graph convolution network; temporal attention; adaptive feature enhancement; human–robot interaction
- CLC:
TP312
- DOI:
10.11992/tis.202006029
- Abstract:
The temporal dynamics of postures is crucial for sequence-based action recognition. Human actions can be represented by the motions of an articulated skeleton, and skeleton-based action recognition algorithms study these body motions. Research shows that most existing skeleton-based methods extract spatial and motion information separately from the skeleton structure and then combine them for further processing. However, this process cannot efficiently capture human motion features with complex temporal and spatial relationships. We propose a novel posture motion-based, spatiotemporally fused graph convolution network for skeleton-based action recognition. First, we define a local posture motion-based temporal attention module, which constrains disturbance information in the temporal domain and learns the representation of motion posture features. Then, we design a posture motion-based spatiotemporal fusion module, which fuses spatial motion and temporal posture features and adaptively enhances the skeleton joint features. Extensive experiments verify the effectiveness of the proposed method, which achieves competitive performance. We also conclude that a human–robot interaction system based on action recognition outperforms a speech-based interaction system in both real-time performance and accuracy.
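The two modules named in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the tensor layout `(T frames, J joints, C channels)`, the motion-energy attention, and the magnitude-based fusion gate are all simplifying assumptions standing in for the learned graph-convolution components.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def motion_temporal_attention(feats):
    """Weight each frame by the magnitude of its local posture motion.

    feats: (T, J, C) array of per-frame joint features. The layout is
    a hypothetical choice; the abstract does not specify tensor shapes.
    """
    # Local posture motion: frame-to-frame difference of joint features
    # (first frame prepended, so its motion is zero).
    motion = np.diff(feats, axis=0, prepend=feats[:1])   # (T, J, C)
    # Per-frame motion energy, turned into an attention weight over time.
    energy = np.linalg.norm(motion, axis=(1, 2))         # (T,)
    attn = softmax(energy)                               # (T,)
    # Suppress low-motion ("disturbance") frames, emphasize salient ones.
    return feats * attn[:, None, None], attn

def spatiotemporal_fusion(spatial_feats, temporal_feats):
    """Adaptively fuse spatial-motion and temporal-posture features.

    A fixed sigmoid gate on feature magnitude stands in for the paper's
    learned adaptive enhancement.
    """
    diff = (np.abs(spatial_feats) - np.abs(temporal_feats)).mean(-1, keepdims=True)
    gate = 1.0 / (1.0 + np.exp(-diff))                   # (T, J, 1)
    return gate * spatial_feats + (1.0 - gate) * temporal_feats
```

The attention step directly encodes the abstract's idea of constraining temporal disturbance: frames with little posture motion receive low weight, so noisy, near-static frames contribute less to the fused representation.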