
[1] ZHANG Jianyu, XIE Juanying. ObjectBoxG: object detection algorithm based on GC3 module[J]. CAAI Transactions on Intelligent Systems, 2024, 19(6): 1385-1394. [doi:10.11992/tis.202310025]


ObjectBoxG: object detection algorithm based on GC3 module


CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP] Volume: 19 Issue: 2024, No. 6 Pages: 1385-1394 Column: Academic Papers: Machine Learning Publication date: 2024-12-05

Title: ObjectBoxG: object detection algorithm based on GC3 module

Author(s): ZHANG Jianyu; XIE Juanying (School of Computer Science, Shaanxi Normal University, Xi’an 710119, China)

Keywords: graph convolutional neural network; feature extraction; feature fusion; object detection; deep learning; anchor-free methods; feature pyramid network; ObjectBox detector; multi-scale features; global features

CLC: TP181

DOI: 10.11992/tis.202310025

Abstract: As research on object detection has advanced, anchor-free methods such as the ObjectBox detector have attracted growing attention. However, the ObjectBox detector has two limitations: it does not fully utilize multiscale features, and it does not adequately consider the correlation between target center points and global information. To address these limitations, a graph convolution layer module (GConv), based on the spectral graph method, is proposed to learn global image features. In addition, a new module named GC3 combines the proposed GConv module with C3 (cross-stage partial network with 3 convolutions) to extract the original, fine, and global image features. GC3 is then combined with the generalized feature pyramid network (GFPN) to form the GGFPN, which is embedded into the ObjectBox detector to yield the ObjectBoxG algorithm. Experiments on benchmark datasets demonstrate that the proposed GC3 module has stronger feature extraction capability than the original C3 module, and that the proposed GGFPN network offers feature learning capability superior to that of GC3. The ObjectBoxG algorithm demonstrates excellent performance in object detection.
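
The abstract does not give the form of the GConv layer, so the following is only a minimal sketch of a generic spectral-style graph convolution, using the renormalized adjacency of Kipf and Welling (a swapped-in, well-known formulation, not the authors' module). The function names `normalized_adjacency` and `graph_conv`, the toy path graph, and the random weights are all hypothetical and chosen for illustration.

```python
import numpy as np


def normalized_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a symmetric adjacency matrix A."""
    a_hat = adj + np.eye(adj.shape[0])   # add self-loops
    deg = a_hat.sum(axis=1)              # node degrees, including self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # Scale row i by d_i^{-1/2} and column j by d_j^{-1/2}.
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def graph_conv(features, adj, weight):
    """One graph convolution layer: ReLU(A_norm @ X @ W)."""
    return np.maximum(normalized_adjacency(adj) @ features @ weight, 0.0)


# Hypothetical toy input: 4 nodes on a path graph 0-1-2-3, 3 channels per node.
adj = np.array(
    [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]],
    dtype=float,
)
rng = np.random.default_rng(0)
features = rng.standard_normal((4, 3))   # node features X
weight = rng.standard_normal((3, 2))     # assumed learnable weights W
out = graph_conv(features, adj, weight)
print(out.shape)  # (4, 2)
```

In an image setting, each node would plausibly correspond to a feature-map location or patch, so a layer of this kind mixes information between connected locations; how the paper builds its graph is not stated in the abstract.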



Last Update: 2024-11-05
