[1]刘璐,贾彩燕.基于文本扩展模型的网络视频聚类方法[J].智能系统学报,2017,12(6):799-805.[doi:10.11992/tis.201706036]
LIU Lu, JIA Caiyan. Web video clustering method based on an extended text model[J]. CAAI Transactions on Intelligent Systems, 2017, 12(6): 799-805. [doi:10.11992/tis.201706036]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume: 12
Issue: No. 6, 2017
Pages: 799-805
Section: Academic Papers: Machine Learning
Publication date: 2017-12-25
- Title:
-
Web video clustering method based on an extended text model
- Author(s):
-
LIU Lu (刘璐)1,2, JIA Caiyan (贾彩燕)1,2
-
1. Beijing Key Lab of Traffic Data Analysis and Mining, Beijing Jiaotong University, Beijing 100044, China;
2. School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China
-
- Keywords:
-
web video clustering; co-click videos; related query terms; text clustering
- Classification number:
-
TP391
- DOI:
-
10.11992/tis.201706036
- Abstract:
-
With the rise and rapid development of video sharing websites, the number of videos on the Internet has grown explosively, and organizing and classifying these videos has become the basis for using them effectively. Video clustering has gained popularity because it relies only on the intrinsic cluster structure of the video data and needs no manual intervention. Existing video clustering methods include those based on the visual similarity of key frames, those based on text clustering of video titles, and multimodal methods that fuse text and visual features. Title-based text clustering is widely used in industry because of its simplicity and efficiency, but it performs poorly because video titles are short texts and therefore semantically sparse. To address this, this paper targets social media videos and proposes a video clustering method that fuses video-related text from multiple sources on social media platforms, so as to overcome the semantic sparsity caused by short titles. Experimental results on different text clustering algorithms demonstrate the effectiveness of the multi-source text fusion method.
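The idea of extending a short title with related text can be illustrated with a minimal sketch. The abstract does not specify how the sources are fused, so the code below assumes plain concatenation of the title with the titles of co-click videos and with related query terms (the two auxiliary sources named in the keywords), followed by TF-IDF weighting and cosine similarity. The function names, the toy data, and the smoothed IDF formula are illustrative assumptions, not the authors' implementation.

```python
import math
from collections import Counter


def extend_text(title, co_click_titles, query_terms):
    """Assumed fusion step: concatenate the title with its auxiliary texts."""
    tokens = title.lower().split()
    for text in list(co_click_titles) + list(query_terms):
        tokens += text.lower().split()
    return tokens


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) from tokenized documents."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Smoothed IDF, always >= 1, so shared terms never get zero weight.
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical toy data: two football videos and one cooking video.
videos = [
    {"title": "messi hat trick",
     "co_click": ["barcelona goals highlights"],
     "queries": ["football highlights"]},
    {"title": "ronaldo free kick",
     "co_click": ["real madrid goals"],
     "queries": ["football goals"]},
    {"title": "easy pasta recipe",
     "co_click": ["italian cooking tutorial"],
     "queries": ["cooking recipe"]},
]

title_only = tfidf_vectors([v["title"].lower().split() for v in videos])
extended = tfidf_vectors(
    [extend_text(v["title"], v["co_click"], v["queries"]) for v in videos]
)

sim_title = cosine(title_only[0], title_only[1])   # football vs football, titles only
sim_extended = cosine(extended[0], extended[1])    # football vs football, extended text
sim_cross = cosine(extended[0], extended[2])       # football vs cooking, extended text
```

In this toy setting the two football titles share no words, so their title-only vectors are orthogonal, while the extended texts overlap on terms contributed by the auxiliary sources. Any text clustering algorithm that works on such vectors could then be run on the extended representation instead of the raw titles.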
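Memo
Received: 2017-06-09; Revised: not stated.
Funding: National Natural Science Foundation of China (61473030).
About the authors: LIU Lu, female, born in 1994, is a master's student whose main research interests are data mining and text clustering. JIA Caiyan, female, born in 1976, is a professor and doctoral supervisor who holds a Ph.D. and is a member of the Rough Sets and Soft Computing Technical Committee of the Chinese Association for Artificial Intelligence. Her main research interests are data mining, social computing, and bioinformatics, and she has published more than 50 academic papers.
Corresponding author: JIA Caiyan. E-mail: cyjia@bjtu.edu.cn.
Last Update: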
2018-01-03