
[1] WANG Zihao, XIA Xiushan, CAO Yang, et al. Multimodal sequence-based petrochemical VOCs plume semantic segmentation[J]. CAAI Transactions on Intelligent Systems, 2025, 20(6): 1420-1431. [doi:10.11992/tis.202501034]


Multimodal sequence-based petrochemical VOCs plume semantic segmentation


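CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP] Volume: 20 Issue: No. 6, 2025 Pages: 1420-1431 Section: Academic Papers: Machine Perception and Pattern Recognition Publication date: 2025-11-05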

Title:: Multimodal sequence-based petrochemical VOCs plume semantic segmentation

Author(s):: WANG Zihao¹, XIA Xiushan¹, CAO Yang², ZHANG Kunyu³; 1. Institute of Advanced Technology, University of Science & Technology of China, Hefei 230031, China;
2. Department of Automation, University of Science & Technology of China, Hefei 230027, China;
3. Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei 230088, China

Keywords:: VOCs plume; gas detectors; semantic segmentation; motion information; diffusion; multimodal feature fusion; infrared imaging; blurred edge

CLC number:: TP391

DOI:: 10.11992/tis.202501034

Abstract:: Under infrared imaging, petrochemical volatile organic compounds (VOCs) plumes have distorted, highly variable shapes, blurred edges, and a semi-transparent appearance. Existing image semantic segmentation methods, applied directly, struggle to extract gas features and segment such plumes poorly. To address this, this paper proposes a multimodal petrochemical VOCs plume segmentation method (MPPS) that incorporates contextual image sequences. First, the diffusion characteristics of the plume edge are used to extract motion diffusion vectors from the frames preceding and following the target frame, and the motion information is superimposed to enhance the edge features of the VOCs plume. Second, because VOCs are not imaged in visible light, an adaptive weight module is designed to fuse visible and infrared image features, further enhancing plume features and filtering out background interference. Third, a plume segmentation decoder based on region proxies is introduced to strengthen the correlation between the edge and center features of the plume while reducing the computational cost of segmentation. In addition, a visible and infrared video dataset of petrochemical VOCs is constructed. Experimental results on this dataset show that, compared with the baseline network, MPPS improves computational efficiency by 1.81 frames/s and segmentation accuracy by 3.53%.
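The first two ideas in the abstract (a motion cue taken from the frames around the target frame, and a weighted fusion of infrared and visible features) can be illustrated with a minimal sketch. This is not the authors' implementation: plain frame differencing stands in for the paper's motion diffusion vectors, two fixed scalar scores stand in for the learned adaptive weights, and every function and parameter name (`motion_cue`, `superimpose`, `fuse`, `alpha`, `score_ir`, `score_vis`) is hypothetical.

```python
import math

def motion_cue(prev, target, nxt):
    """Mean absolute change between the target frame and its two neighbours.

    Assumption: simple frame differencing, used here only as a stand-in
    for the motion diffusion vectors described in the abstract.
    """
    return [[(abs(t - p) + abs(n - t)) / 2.0
             for p, t, n in zip(row_p, row_t, row_n)]
            for row_p, row_t, row_n in zip(prev, target, nxt)]

def superimpose(target, cue, alpha=0.5):
    """Add the scaled motion cue onto the target frame (alpha is illustrative)."""
    return [[t + alpha * c for t, c in zip(row_t, row_c)]
            for row_t, row_c in zip(target, cue)]

def fuse(ir, vis, score_ir, score_vis):
    """Blend infrared and visible feature maps with softmax-normalised weights.

    Assumption: the two scores are given scalars; in a trained network they
    would be produced by a learned module.
    """
    e_ir, e_vis = math.exp(score_ir), math.exp(score_vis)
    w_ir = e_ir / (e_ir + e_vis)
    return [[w_ir * i + (1.0 - w_ir) * v for i, v in zip(row_i, row_v)]
            for row_i, row_v in zip(ir, vis)]

# Toy 2x3 infrared frames: a bright region spreads to the right over time.
prev_frame = [[0, 0, 0], [0, 0, 0]]
target_frame = [[0, 1, 0], [0, 0, 0]]
next_frame = [[0, 2, 1], [0, 0, 0]]

cue = motion_cue(prev_frame, target_frame, next_frame)
enhanced = superimpose(target_frame, cue)
```

In this toy input the pixel that is about to be reached by the spreading region receives a non-zero value in `enhanced` even though it is zero in the target frame, which is the kind of edge emphasis the abstract attributes to superimposed motion information.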


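Memo

Received: 2025-01-27.
Funding: Key Research and Development Program of Anhui Province (2022107020030).
Author biographies: WANG Zihao, master's student. Research interests: computer vision and image segmentation. E-mail: wzh8096@mail.ustc.edu.cn. XIA Xiushan, associate researcher. Research interests: computer vision and multimodal information processing. He has led or participated in more than 10 national and provincial/ministerial research projects and has published more than 10 academic papers. E-mail: xiaxiushan@iat.ustc.edu.cn. CAO Yang, associate professor and doctoral supervisor. Research interests: computer vision and intelligent robots. He has led sub-projects of the National Key Research and Development Program and projects of the National Natural Science Foundation of China, has received one first prize of the Chinese Association of Automation Science and Technology Award, and has published more than 50 academic papers. E-mail: forrest@ustc.edu.cn.
Corresponding author: XIA Xiushan. E-mail: xiaxiushan@iat.ustc.edu.cn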

