[1] WANG Luyao, WANG Fengsui, YAN Tao, et al. Cross-modal person re-identification combining multi-scale features and confusion learning[J]. CAAI Transactions on Intelligent Systems, 2024, 19(4): 898-908. [doi:10.11992/tis.202304010]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume: 19
Issue: 2024(4)
Pages: 898-908
Column: Academic Papers - Machine Perception and Pattern Recognition
Publication date: 2024-07-05
Title:
Cross-modal person re-identification combining multi-scale features and confusion learning
Author(s):
WANG Luyao 1,2,3; WANG Fengsui 1,2,3; YAN Tao 1,2,3; CHEN Yuanmei 1,2,3
1. School of Electrical Engineering, Anhui Polytechnic University, Wuhu 241000, China;
2. Anhui Key Laboratory of Detection Technology and Energy Saving Devices, Anhui Polytechnic University, Wuhu 241000, China;
3. Key Laboratory of Advanced Perception and Intelligent Control of High-end Equipment, Ministry of Education, Anhui Polytechnic University, Wuhu 241000, China
Keywords:
machine vision; person re-identification; cross-modal; multi-scale features; coarse-grained; fine-grained; confusion learning; modality-independent attributes
CLC:
TP391.4
DOI:
10.11992/tis.202304010
Abstract:
The difficulty of cross-modal person re-identification stems mainly from the large modality discrepancy and the intra-modality variation between pedestrian images. To address these issues, a network structure combining multi-scale features with confusion learning is proposed. To extract features efficiently and reduce intra-modality variation, the network adopts a complementary multi-scale design that learns locally refined features and globally coarse features of pedestrians, enhancing its feature representation from both the fine-grained and the coarse-grained perspective. A confusion learning strategy is used to blur the network's modality-discrimination feedback and mine stable, effective modality-independent attributes to cope with the modality discrepancy, thereby improving the robustness of the features to modality changes. In the all-search mode of the large-scale SYSU-MM01 dataset, the algorithm achieves a rank-1 accuracy of 76.69% and a mean average precision (mAP) of 72.45%; in the visible-to-infrared mode of the RegDB dataset, it achieves a rank-1 accuracy of 94.62% and an mAP of 94.60%. These results surpass those of the main existing methods and verify the effectiveness of the proposed method.
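The abstract gives no implementation details. As a rough illustration of the confusion learning idea (blurring the network's modality-discrimination feedback so that the learned features carry modality-independent attributes), the following is a minimal PyTorch sketch. The discriminator design, the feature dimension, and the uniform-distribution target are assumptions made for illustration, not the paper's actual method.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ModalityDiscriminator(nn.Module):
    # Hypothetical head that predicts the modality (visible vs. infrared)
    # of a pooled pedestrian feature vector.
    def __init__(self, feat_dim: int = 2048):
        super().__init__()
        self.classifier = nn.Sequential(
            nn.Linear(feat_dim, 512),
            nn.ReLU(inplace=True),
            nn.Linear(512, 2),  # two modality classes
        )

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        return self.classifier(feats)

def confusion_loss(logits: torch.Tensor) -> torch.Tensor:
    # Cross-entropy against a uniform (50/50) target: minimizing this
    # pushes the discriminator's modality prediction toward maximum
    # uncertainty, i.e. it "confuses" the modality feedback.
    log_probs = F.log_softmax(logits, dim=1)
    uniform = torch.full_like(log_probs, 1.0 / logits.size(1))
    return -(uniform * log_probs).sum(dim=1).mean()

# Usage sketch: feats would come from the re-identification backbone.
disc = ModalityDiscriminator(feat_dim=2048)
feats = torch.randn(8, 2048)                # dummy batch of pooled features
loss_confuse = confusion_loss(disc(feats))  # gradient blurs modality cues
loss_confuse.backward()

In practice, such a confusion objective would typically alternate with a standard cross-entropy loss that trains the discriminator on the true modality labels, with only the backbone receiving the confusion gradient; whether the paper uses this adversarial-style alternation is likewise an assumption here.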