[1]陈丽,马楠,逄桂林,等.多视角数据融合的特征平衡YOLOv3行人检测研究[J].智能系统学报,2021,16(1):57-65.[doi:10.11992/tis.202010003]
CHEN Li,MA Nan,PANG Guilin,et al.Research on multi-view data fusion and balanced YOLOv3 for pedestrian detection[J].CAAI Transactions on Intelligent Systems,2021,16(1):57-65.[doi:10.11992/tis.202010003]
CAAI Transactions on Intelligent Systems 《智能系统学报》 [ISSN 1673-4785 / CN 23-1538/TP]
Volume: 16
Issue: No. 1, 2021
Pages: 57-65
Section: Academic Papers - Machine Perception and Pattern Recognition
Publication date: 2021-01-05
Title: Research on multi-view data fusion and balanced YOLOv3 for pedestrian detection
Authors: CHEN Li1, MA Nan1,2, PANG Guilin3, GAO Yue4, LI Jiahong1,2, ZHANG Guoping1, WU Zhixuan1, YAO Yongqiang1
1. Beijing Key Laboratory of Information Service Engineering, Beijing Union University, Beijing 100101, China;
2. College of Robotics, Beijing Union University, Beijing 100101, China;
3. School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China;
4. School of Software, Tsinghua University, Beijing 100085, China
Keywords: multi-view data; self-supervised learning; feature point matching; feature fusion; YOLOv3 network; balanced feature; complex scene; pedestrian detection
CLC number: TP391
DOI: 10.11992/tis.202010003
Abstract: Pedestrian detection in complex scenes is difficult because of occlusion and the low accuracy of long-distance detection. To address these problems, this paper proposes a multi-view data fusion, feature-balanced YOLOv3 pedestrian detection model (MVBYOLO) with two parts: a self-supervised multi-view feature point fusion model (Self-MVFM) and a balanced YOLOv3 network (BYOLO). Self-MVFM learns features from two or more input views in a self-supervised manner, fuses the multi-view information through feature point matching, and applies a weighted smoothing algorithm to remove the color differences that arise during fusion. BYOLO fuses high-level semantic features and low-level detail features at the same resolution to obtain balanced, semantically enhanced multi-level features, which improves the accuracy of detecting pedestrians in front of the vehicle in complex scenes. Comparative experiments on the VOC dataset verify the effectiveness of the proposed method: the final AP reaches 80.14%, 2.89% higher than that of the original YOLOv3 network.
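The record does not reproduce the paper's formulas, but the weighted smoothing step in Self-MVFM can be illustrated with a minimal sketch. Assuming the two views have already been registered via feature point matching, one common weighted smoothing scheme is a linear cross-fade over the overlapping columns; the function name, the fixed vertical seam, and the linear weights below are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def blend_overlap(left_img, right_img, overlap):
    """Hypothetical weighted-smoothing blend of two registered views that share an
    `overlap`-pixel-wide vertical seam (a stand-in for Self-MVFM's smoothing step).

    left_img, right_img: HxWx3 float arrays; returns the stitched result with a
    linear cross-fade over the overlap, which suppresses the color difference.
    """
    # Linear weights: 1 -> 0 for the left view across the overlap, 0 -> 1 for the right.
    alpha = np.linspace(1.0, 0.0, overlap).reshape(1, overlap, 1)
    blended = alpha * left_img[:, -overlap:] + (1.0 - alpha) * right_img[:, :overlap]
    return np.concatenate(
        [left_img[:, :-overlap], blended, right_img[:, overlap:]], axis=1
    )

# Toy usage: two synthetic 4x8 "views" with different brightness, overlapping by 3 columns.
a = np.full((4, 8, 3), 0.2)
b = np.full((4, 8, 3), 0.8)
pano = blend_overlap(a, b, overlap=3)
print(pano.shape)  # (4, 13, 3)
```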
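Likewise, the "balanced feature" idea in BYOLO, fusing high-level semantic and low-level detail features at a single resolution, can be sketched as below. This follows the generic balanced-feature-pyramid recipe (resize all levels, average them, redistribute the result); the residual add-back, the shared channel width, and the function names are assumptions, since the record does not describe BYOLO's exact wiring.

```python
import torch
import torch.nn.functional as F

def balance_features(feats, target_level=1):
    """Sketch of balanced multi-level feature fusion over a YOLOv3-style pyramid.

    Assumes all levels share the same channel width so they can be averaged;
    in a real network a 1x1 conv would usually align the channels first.
    """
    target_size = feats[target_level].shape[-2:]
    # 1) Bring every pyramid level to one common resolution.
    resized = [F.interpolate(f, size=target_size, mode="nearest") for f in feats]
    # 2) Integrate them into a single balanced, semantically enhanced map.
    balanced = torch.stack(resized, dim=0).mean(dim=0)
    # 3) Scatter the balanced map back onto each original level as a residual.
    return [
        f + F.interpolate(balanced, size=f.shape[-2:], mode="nearest")
        for f in feats
    ]

# Toy pyramid: three levels at strides 8/16/32 for a 256x256 input.
pyramid = [torch.randn(1, 256, 32, 32),
           torch.randn(1, 256, 16, 16),
           torch.randn(1, 256, 8, 8)]
out = balance_features(pyramid)
print([o.shape for o in out])
```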
Memo
Received: 2020-10-07.
Foundation items: National Natural Science Foundation of China (61871038, 61931012, 6183034); Common Pre-research Program of the Equipment Development Department of the Central Military Commission (41412040302); Leading Talent Program of the Beijing Union University "Talent Strengthening Optimization Plan" (BPHR2020AZ02); Beijing Union University Postgraduate Research and Innovation Funding Project (YZ2020K001)
Biographies: CHEN Li, master's student; research interests include multi-view data fusion and pedestrian action recognition. MA Nan, professor, PhD; research interests include interactive cognition, knowledge discovery, and intelligent systems; led teams that won the championship (Leading Award) of the virtual-scenario event at the 2018, 2019, and 2020 WIC World Autonomous Driving Challenges; holds 7 authorized invention patents and 13 software copyrights; has published more than 50 academic papers and edited 3 monographs and textbooks. PANG Guilin, master's student; research interests include computer vision and lane line detection.
Corresponding author: MA Nan. E-mail: xxtmanan@buu.edu.cn
Last update: 2021-02-25