[1] YAN He, LIU Lingkun, HUANG Junbin, et al. Video summarization model based on the multiscale attention mechanism and bidirectional gated recurrent network[J]. CAAI Transactions on Intelligent Systems, 2024, 19(2): 446-454. [doi:10.11992/tis.202209048]

Video summarization model based on the multiscale attention mechanism and bidirectional gated recurrent network

References:
[1] JI Zhong, JIANG Junjie. Video summarization based on decoder attention mechanism[J]. Journal of Tianjin University (science and technology edition), 2018, 51(10): 1023–1030.
[2] ELHAMIFAR E, VIDAL R. Sparse subspace clustering: algorithm, theory, and applications[J]. IEEE transactions on pattern analysis and machine intelligence, 2013, 35(11): 2765–2781.
[3] ELHAMIFAR E, SAPIRO G, SASTRY S S. Dissimilarity-based sparse subset selection[J]. IEEE transactions on pattern analysis and machine intelligence, 2016, 38(11): 2182–2197.
[4] ELHAMIFAR E, DE PAOLIS KALUZA M C. Subset selection and summarization in sequential data[M]. [S.l.]: Advances in Neural Information Processing Systems, 2017, 30.
[5] ZHU Wencheng, LU Jiwen, LI Jiahao, et al. DSNet: a flexible detect-to-summarize network for video summarization[J]. IEEE transactions on image processing, 2021, 30: 948–962.
[6] ZHANG Ke, CHAO Weilun, SHA Fei, et al. Video summarization with long short-term memory[M]//Computer Vision-ECCV 2016. Cham: Springer International Publishing, 2016: 766-782.
[7] ZHOU Kaiyang, QIAO Yu, XIANG Tao. Deep reinforcement learning for unsupervised video summarization with diversity-representativeness reward[C]// Proceedings of the 32nd AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2018: 7582-7589.
[8] MA Yufei, LU Lie, ZHANG Hongjiang, et al. A user attention model for video summarization[C]//Proceedings of the tenth ACM International Conference on Multimedia. New York: ACM, 2002: 533-542.
[9] JIANG Peng, QIN Xiaolin. Keyframe-based video summary using visual attention clues[J]. IEEE MultiMedia, 2010, 17(2): 64–73.
[10] EJAZ N, MEHMOOD I, BAIK S W. Efficient visual attention based framework for extracting key frames from videos[J]. Signal processing:image communication, 2013, 28(1): 34–44.
[11] ZHU Wencheng, LU Jiwen, HAN Yucheng, et al. Learning multiscale hierarchical attention for video summarization[J]. Pattern recognition, 2022, 122: 108312.
[12] LI Yiyi, WANG Jilong. Self-attention based video summarization[J]. Journal of computer-aided design & computer graphics, 2020, 32(4): 652–659.
[13] JI Zhong, XIONG Kailin, PANG Yanwei, et al. Video summarization with attention-based encoder–decoder networks[J]. IEEE transactions on circuits and systems for video technology, 2020, 30(6): 1709–1717.
[14] FAJTL J, SOKEH H S, ARGYRIOU V, et al. Summarizing videos with attention[M]//Computer Vision-ACCV 2018 Workshops. Cham: Springer International Publishing, 2019: 39-54.
[15] MAHASSENI B, LAM M, TODOROVIC S. Unsupervised video summarization with adversarial LSTM networks[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 2982-2991.
[16] LEI Jie, LUAN Qiao, SONG Xinhui, et al. Action parsing-driven video summarization based on reinforcement learning[J]. IEEE transactions on circuits and systems for video technology, 2019, 29(7): 2126–2137.
[17] LI Leiting, WU Guangli, GUO Zhenzhou. Video summarization generation based on self-attention mechanism and random forest regression[J]. Computer engineering and applications, 2022, 58(4): 198–205.
[18] ZHOU Juanping, LU Lu. Wide and deep learning for video summarization via attention mechanism and independently recurrent neural network[C]//2020 Data Compression Conference. Snowbird: IEEE, 2020: 407.
[19] WANG Xingrun, NIE Xiushan, YANG Fan, et al. Video summarization based on learning to rank[J]. CAAI transactions on intelligent systems, 2018, 13(6): 921–927.
[20] XU Huijuan, DAS A, SAENKO K. Two-stream region convolutional 3D network for temporal activity detection[J]. IEEE transactions on pattern analysis and machine intelligence, 2019, 41(10): 2319–2332.
[21] POTAPOV D, DOUZE M, HARCHAOUI Z, et al. Category-specific video summarization[M]//Computer Vision-ECCV 2014. Cham: Springer International Publishing, 2014: 540-555.
[22] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C] //Proceedings of the 31st International Conference on Advances in Neural Information Processing Systems. New York: Curran Associates, Inc, 2017: 5998-6008.
[23] GYGLI M, GRABNER H, RIEMENSCHNEIDER H, et al. Creating summaries from user videos[M]//Computer Vision-ECCV 2014. Cham: Springer International Publishing, 2014: 505-520.
[24] SONG Yale, VALLMITJANA J, STENT A, et al. TVSum: Summarizing web videos using titles[C]//2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston: IEEE, 2015: 5179-5187.
[25] DE AVILA S E F, LOPES A P B, DA LUZ A, et al. VSUMM: a mechanism designed to produce static video summaries and a novel evaluation method[J]. Pattern recognition letters, 2011, 32(1): 56–68.
[26] WANG Kunyang, GAO Wei, TENG Guowei. Video summarization based on gated multi-head attention mechanism[J]. Industrial control computer, 2022, 35(12): 120–122.
[27] MEI Feng, ZHOU Juanping, LU Lu. Research on video summarization technology combining local reward mechanism[J]. Computer engineering and applications, 2021, 57(11): 211–218.

Memo

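Received: 2022-09-23.
Funding: National Key R&D Program of China, "Intelligent Robot" Key Special Project (2018YFB1308602); National Natural Science Foundation of China, General Program (61173184); Natural Science Foundation of Chongqing (cstc2018jcyjAX0694).
About the authors: YAN He, Ph.D., professor. Main research interests: multiscale geometric analysis of images, object tracking, and pattern recognition. Has led one General Program project of the National Natural Science Foundation of China and one China Postdoctoral Science Foundation project, as well as two Natural Science Foundation of Chongqing projects and two visiting-scholar fund projects of Ministry of Education key laboratories; took part, as the institution's lead, in one "Intelligent Robot" Key Special Project under the Ministry of Science and Technology's 13th Five-Year key R&D program; and participated in more than 10 provincial- and ministerial-level projects. Has published more than 90 academic papers. E-mail: yanhe@cqut.edu.cn. LIU Lingkun, master's student. Main research interests: deep-learning-based video summarization, video understanding, and object detection. E-mail: LiuLingK@stu.cqut.edu.cn. HUANG Junbin, master's student. Main research interests: deep-learning-based video summarization and video captioning methods. E-mail: huangjunbin@2020.cqut.edu.cn
Corresponding author: YAN He. E-mail: yanhe@cqut.edu.cn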
