Recently, our school, in collaboration with off-campus researchers, published a paper entitled "Part Aware Contrastive Learning for Self-Supervised Action Recognition" at the International Joint Conference on Artificial Intelligence (IJCAI, h5 index: 120, IF: 12.04), a top conference in artificial intelligence. Hua Yilei, a master's student at our school, is the first author; the co-corresponding authors are Chen Chen from the Computer Vision Research Center of the University of Central Florida and Wu Shiqian from our school.

In recent years, contrastive learning on skeleton sequences has achieved remarkable results in self-supervised action recognition. It has been observed that the semantic distinctions among human actions are often expressed by local body parts, such as the legs or hands, which makes these parts especially informative for skeleton-based action recognition. This paper proposes an attention-based contrastive learning framework for skeleton representation learning, called SkeAttnCLR, which integrates local similarity and global features for skeleton-based action representations. To achieve this, a multi-head attention mask module learns soft attention masks over the skeleton features, suppressing non-salient local features while accentuating salient ones, thereby bringing similar local features closer together in the feature space. In addition, ample contrastive pairs are generated by combining salient and non-salient features with global features, which guides the network to learn semantic representations of the entire skeleton. With this attention mask mechanism, SkeAttnCLR learns consistent local features across different data augmentation views. The experimental results demonstrate that incorporating local feature similarity significantly enhances skeleton-based action representation: the proposed SkeAttnCLR outperforms state-of-the-art methods on the NTU RGB+D, NTU RGB+D 120, and PKU-MMD datasets.

The arXiv version is available at https://arxiv.org/abs/2305.00666. The code and settings are available at https://github.com/GitHubOfHyl97/SkeAttnCLR.
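For readers curious how a soft attention mask of the kind described above might look in code, here is a minimal PyTorch sketch. The module name SoftAttnMask, the tensor shapes, and the NT-Xent contrastive pairing are illustrative assumptions, not the paper's exact design; the authors' actual implementation is in the repository linked above.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftAttnMask(nn.Module):
    """Hypothetical multi-head attention mask over per-part skeleton features.

    Predicts a soft mask in (0, 1) that accentuates salient local features
    and suppresses the non-salient remainder (illustrative, not the
    authors' exact module).
    """
    def __init__(self, feat_dim: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(feat_dim, num_heads, batch_first=True)
        self.score = nn.Linear(feat_dim, 1)

    def forward(self, x: torch.Tensor):
        # x: (batch, num_parts, feat_dim) local features per body part
        ctx, _ = self.attn(x, x, x)              # contextualize parts against each other
        mask = torch.sigmoid(self.score(ctx))    # soft attention mask in (0, 1)
        salient = x * mask                       # accentuated local features
        non_salient = x * (1.0 - mask)           # suppressed complement
        return salient, non_salient

def nt_xent(z1: torch.Tensor, z2: torch.Tensor, tau: float = 0.1):
    """Standard NT-Xent loss between two feature views (one common choice
    for the contrastive objective; the paper's exact loss may differ)."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / tau
    targets = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, targets)

# Toy usage: pull salient features from two augmented views closer together.
if __name__ == "__main__":
    module = SoftAttnMask(feat_dim=128)
    v1 = torch.randn(8, 10, 128)   # view 1: 8 samples, 10 body parts
    v2 = torch.randn(8, 10, 128)   # view 2: a different augmentation
    s1, _ = module(v1)
    s2, _ = module(v2)
    loss = nt_xent(s1.mean(dim=1), s2.mean(dim=1))
    print(loss.item())

In this sketch, the salient/non-salient split is what would let a framework like SkeAttnCLR form extra contrastive pairs alongside the global features, as the abstract describes.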