[1] WU Fei, FENG Hua-min, SHEN Xiao-ye. Research on event detection based on the tolerance rough set model[J]. CAAI Transactions on Intelligent Systems, 2009, 4(02): 112-117.

Research on event detection based on the tolerance rough set model

CAAI Transactions on Intelligent Systems [ISSN: 1673-4785 / CN: 23-1538/TP]

Volume:
Volume 4
Issue:
2009, Issue 02
Pages:
112-117
Column:
Publication date:
2009-04-25

Article Info

Title:
Research on event detection based on the tolerance rough set model
Article ID:
1673-4785(2009)02-0112-06
Author(s):
WU Fei (毋非)1, FENG Hua-min (封化民)1,2, SHEN Xiao-ye (申晓晔)1
1. School of Telecommunication Engineering, Xidian University, Xi'an 710071, China;
2. Multimedia Intelligent Information Processing Laboratory, Beijing Electronic Science and Technology Institute, Beijing 100070, China
Keywords:
event detection; rough set; tolerance rough set model
CLC number:
TP391
Document code:
A
Abstract:
Proper monitoring and management of Web news published by websites is an important part of safeguarding network content security. Applying existing text representation models to Web news leads to sparse text representations and to drift of topic words during topic tracking. A text representation model based on tolerance rough sets addresses both problems. On the basis of theoretical analysis and experimental verification, and in combination with the vector space model (VSM), a tolerance rough set of feature terms is constructed from the co-occurrence of feature terms in the document set. The tolerance rough sets of the feature terms are then used to generate a tolerance rough set model of each document, which extends the original document representation. Finally, the tolerance classes of the feature terms are used to describe the similarity between documents, which drives the event detection process. Experimental results show that the tolerance rough set model improves the performance of the event detection system.
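The abstract gives no formulas, so the following is only a minimal sketch of the general tolerance rough set idea it describes, not the authors' implementation. It assumes that two feature terms are tolerant when they co-occur in at least `theta` documents, that a document is extended to its upper approximation (every term whose tolerance class overlaps the document), and that similarity is measured with Jaccard overlap. The names `tolerance_classes`, `upper_approximation`, `jaccard`, the threshold `theta`, and the toy documents are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def tolerance_classes(docs, theta):
    """Map each term to its tolerance class: the term itself plus every
    term that co-occurs with it in at least `theta` documents (assumed rule)."""
    cooc = defaultdict(int)
    vocab = set()
    for doc in docs:
        terms = set(doc)
        vocab |= terms
        # Count each unordered pair once per document.
        for a, b in combinations(sorted(terms), 2):
            cooc[(a, b)] += 1
    classes = {t: {t} for t in vocab}
    for (a, b), n in cooc.items():
        if n >= theta:
            classes[a].add(b)
            classes[b].add(a)
    return classes


def upper_approximation(doc, classes):
    """Return every term whose tolerance class overlaps the document's terms."""
    terms = set(doc)
    return {t for t, cls in classes.items() if cls & terms}


def jaccard(a, b):
    """Jaccard overlap of two term collections (assumed similarity measure)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


# Toy corpus, invented purely for illustration.
docs = [
    ["earthquake", "rescue", "sichuan"],
    ["earthquake", "rescue", "relief"],
    ["earthquake", "sichuan", "relief"],
    ["football", "match", "goal"],
]
classes = tolerance_classes(docs, theta=2)

# Two very short stories about the same event share no literal term.
raw_sim = jaccard(["rescue"], ["sichuan"])
ext_sim = jaccard(
    upper_approximation(["rescue"], classes),
    upper_approximation(["sichuan"], classes),
)
```

In this toy corpus the two short stories have no terms in common, yet their upper approximations both pick up "earthquake", so their similarity becomes non-zero. That is the sparseness effect the abstract says the extended representation is meant to reduce.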

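Memo

Received: 2008-12-16.
About the authors:
WU Fei, female, born in 1984, master's student. Her main research interests are Web news content security and information retrieval.
FENG Hua-min, male, born in 1963, professor and master's supervisor. His main research interests are multimedia intelligent information processing and network security.
SHEN Xiao-ye, female, born in 1984, master's student. Her main research interests are Web news content security and public opinion orientation analysis.
Corresponding author: WU Fei. E-mail: wuf@besti.cn
Last Update: 2009-05-04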