[1] WANG Xingrun, NIE Xiushan, YANG Fan, et al. Video summarization based on learning to rank[J]. CAAI Transactions on Intelligent Systems, 2018, 13(06): 921-927. [doi:10.11992/tis.201806013]

Video summarization based on learning to rank

CAAI Transactions on Intelligent Systems [ISSN: 1673-4785 / CN: 23-1538/TP]

Volume:
Vol. 13
Issue:
2018, No. 06
Pages:
921-927
Section:
Publication date:
2018-10-25

Article Info

Title:
Video summarization based on learning to rank
Author(s):
WANG Xingrun (王鈃润)1, NIE Xiushan (聂秀山)2, YANG Fan (杨帆)2, LYU Peng (吕鹏)2, YIN Yilong (尹义龙)3
1. School of Computer Science and Technology, Shandong University, Ji’nan 250101, China;
2. School of Computer Science and Technology, Shandong University of Finance and Economics, Ji’nan 250014, China;
3. School of Software Engineering, Shandong University, Ji’nan 250101, China
Keywords:
video frame; summary; video frame grabbers; ranking; video operation; video images; video; deep learning
Classification number:
TP389.1
DOI:
10.11992/tis.201806013
Abstract:
The rapid growth of video data poses a series of problems and challenges for applications such as video browsing, storage, and retrieval, and video summarization is an effective way to address them. Existing video summarization algorithms construct objective functions from constraints and empirical settings and assign scores to sets of frames, which introduces uncertainty and high complexity. To address these problems, this paper proposes a video summarization method based on learning to rank. The method treats summary extraction as the problem of ranking frames by their relevance to the content of the video. First, a ranking function is learned from a training set so that frames highly relevant to the video are ranked first. Then, the learned ranking function is used to score each frame. Finally, keyframes are selected according to their scores to form the video summary. Because the method scores individual frames rather than sets of frames, its computational complexity is significantly lower than that of existing methods. Experimental results on the TVSum50 dataset confirm the effectiveness of the method.
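The three steps in the abstract (learn a ranking function, score each frame, keep the highest-scoring frames) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the ranking model, so the linear scoring function, the pairwise hinge loss, the per-frame feature vectors, and the names `train_pairwise_ranker` and `select_keyframes` are all hypothetical choices made for the example.

```python
import numpy as np


def train_pairwise_ranker(features, relevance, epochs=200, lr=0.1, reg=1e-3):
    """Fit a linear scoring vector w with a pairwise hinge loss (assumed model).

    features:  (n_frames, n_dims) array of per-frame feature vectors.
    relevance: (n_frames,) array of training relevance labels.
    """
    features = np.asarray(features, dtype=float)
    relevance = np.asarray(relevance, dtype=float)
    # Every ordered pair (hi, lo) where frame `hi` is more relevant than `lo`.
    hi, lo = np.where(relevance[:, None] > relevance[None, :])
    diffs = features[hi] - features[lo]
    if len(diffs) == 0:
        raise ValueError("need at least two frames with different relevance")
    w = np.zeros(features.shape[1])
    for _ in range(epochs):
        # Pairs whose score margin is below 1 contribute to the subgradient.
        violated = diffs @ w < 1.0
        grad = reg * w - diffs[violated].sum(axis=0) / len(diffs)
        w -= lr * grad
    return w


def select_keyframes(features, w, k):
    """Score each frame individually and return the k best, in temporal order."""
    scores = np.asarray(features, dtype=float) @ w
    top = np.argsort(-scores)[:k]
    return np.sort(top), scores


if __name__ == "__main__":
    # Synthetic stand-in data; real features and labels would come from a dataset.
    rng = np.random.default_rng(0)
    train_x = rng.normal(size=(50, 4))
    train_y = train_x @ np.array([1.0, -2.0, 0.5, 0.0])
    w = train_pairwise_ranker(train_x, train_y)
    keyframes, _ = select_keyframes(rng.normal(size=(30, 4)), w, k=5)
    print("selected frame indices:", keyframes)
```

Scoring is one dot product per frame, so selection cost grows linearly with the number of frames, which reflects the abstract's point about scoring frames rather than sets of frames.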

Memo:
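Received: 2018-06-06.
Funding: National Natural Science Foundation of China (61671274, 61573219); China Postdoctoral Science Foundation (2016M592190); Key Research and Development Program of Shandong Province (2017CXGC1504); Shandong Provincial Program for Cultivating Talent Teams in Advantageous Disciplines of Universities.
Author biographies: WANG Xingrun, female, born in 1994; her main research interests are machine learning and multimedia information processing. NIE Xiushan, male, born in 1981, professor, Ph.D.; his main research interests are machine learning and multimedia information processing. He is a member of the CCF Technical Committee on Artificial Intelligence and Pattern Recognition, a corresponding member of the CAAI Technical Committee on Machine Learning, and a member of the CCF Technical Committee on Computer Vision. He has led one General Program project and one Young Scientists project of the National Natural Science Foundation of China and has published more than 30 academic papers. YANG Fan, male, born in 1983; his main research interests are machine learning, convex optimization, and biomedicine.
Corresponding author: NIE Xiushan. E-mail: niexiushan@163.com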
Last Update: 2018-12-25