[1]XIAO Rong, KONG Liang, ZHANG Yan. A text clustering model for diverse versions discovery[J]. CAAI Transactions on Intelligent Systems, 2012, 7(4): 307-314.

A text clustering model for diverse versions discovery

References:
[1]ALLAN J. Topic detection and tracking: event-based information organization[M]. Boston: Kluwer Academic Publishers, 2002: 1241-1253.
[2]HE T T, QU G Z, LI S W, et al. Semiautomatic hot event detection[C]//Lecture Notes in Computer Science. Hong Kong, China, 2006: 1008.
[3]YU M Q, LUO W H, XU H B, et al. Research on hierarchical topic detection in topic detection and tracking[J]. Journal of Computer Research and Development, 2006, 43(3): 489-495.
[4]QIU Likun, LONG Zhiyi, ZHONG Hua, et al. Hierarchical topic detection and tracking and implementation of system[J]. Journal of Guangxi Normal University: Natural Science Edition, 2007, 25(2): 157-160.
[5]CARTHY J. Lexical chains versus keywords for topic tracking[C]//Proceedings of the 5th International Conference on Intelligent Text Processing and Computational Linguistics. Seoul, Korea, 2004: 507-510.
[6]ALLAN J, CARBONELL J, DODDINGTON G, et al. Topic detection and tracking pilot study final report[C]//Proceedings of the DARPA Broadcasting News Transcript and Understanding Workshop. [S.l.], 1998: 194-218.
[7]YANG Y, PIERCE T, CARBONELL J. A study of retrospective and online event detection[C]//Special Interest Group on Information Retrieval’98. Melbourne, Australia, 1998: 28-36.
[8]ALLAN J, PAPKA R, LAVRENKO V. Online new event detection and tracking[C]//Special Interest Group on Information Retrieval’98. Melbourne, Australia, 1998: 37-45.
[9]BRANTS T, CHEN F, FARAHAT A. A system for new event detection[C]//Special Interest Group on Information Retrieval’03. Toronto, Canada, 2003: 330-337.
[10]NALLAPATI R, FENG A, PENG F, et al. Event threading within news topics[C]//International Conference on Information and Knowledge Management. Washington, DC, USA, 2004: 446-453.
[11]STEINBACH M, KARYPIS G, KUMAR V. A comparison of document clustering techniques[EB/OL]. [2011-05-14]. http://www.cs.cmu.edu/~dunja/KDDpapers/Steinbach_IR.pdf.
[12]PAUL S B, USAMA M F. Refining initial points for K-means clustering[C]//Proceedings of the Fifteenth International Conference on Machine Learning. San Francisco, USA, 1998: 91-99.
[13]JAIN A K, MURTY M N, FLYNN P J. Data clustering: a review[J]. ACM Computing Surveys, 1999, 31(3): 264-333.
[14]RYMOND T, HAN J W. Efficient and effective clustering methods for spatial data mining[C]//Proceedings of the 20th International Conference on Very Large Data Bases. Hong Kong, China, 1994: 144-155.
[15]KONG L, YAN R, HE Y J, et al. DVD: a model for event diversified versions discovery[C]//Asia-Pacific Web Conference’11. Beijing, China, 2011: 18-20.
[16]FLAKE G W, LAWRENCE S, GILES C L. Efficient identification of Web communities[C]//International Conference on Knowledge Discovery and Data Mining’00. Boston, USA, 2000: 160-169.
[17]ROCCHIO J. Relevance feedback in information retrieval[C]//The SMART Retrieval System: Experiments in Automatic Document Processing. Englewood Cliffs, USA, 1971: 313-323.
[18]DASGUPTA S, NG V. Towards subjectifying text clustering[C]//Special Interest Group on Information Retrieval’10. Geneva, Switzerland, 2010: 483-490.
[19]DUMAIS S T, PLATT J, HECKERMAN D, et al. Inductive learning algorithms and representations for text categorization[C]//Proceedings of the Seventh International Conference on Information and Knowledge Management. New York, USA, 1998: 148-155.
[20]FRANZ M, WARD T, MCCARLEY J S, et al. Unsupervised and supervised clustering for topic tracking[C]//Special Interest Group on Information Retrieval’01. New Orleans, USA, 2001: 310-317.
[21]BLEI D M, ANDREW Y NG, MICHAEL I J. Latent Dirichlet allocation[J]. The Journal of Machine Learning Research, 2003(3): 993-1022.
[22]WEI X, CROFT W B. LDA-based document models for ad-hoc retrieval[C]//Proceedings of the 29th Special Interest Group on Information Retrieval Conference. New York, USA, 2006: 178-185.
[23]BHATTACHARYA I, GETOOR I. A latent Dirichlet model for unsupervised entity resolution[C]//SIAM International Conference on Data Mining. Bethesda, USA, 2006: 47-58.
[24]JEROME R B. A novel word clustering algorithm based on latent semantic analysis[C]//Acoustics, Speech, and Signal Processing 1996. [S.l.], 1996: 172-175.
[25]SUN B, SHI L, KONG L, et al. Describing web topics meticulously through word graph analysis[C]//The IEEE Conference on Instructional Technologies’09. Xiamen, China, 2009: 11-14.
[26]PAGE L, BRIN S, MOTWANI R, et al. The pagerank citation ranking: bringing order to the web[C]//Proceedings of the 7th International World Wide Web Conference. Brisbane, Australia, 1998: 161-172.
[27]KAREN J S. A statistical interpretation of term specificity and its application in retrieval[J]. Journal of Documentation, 1972, 28(1): 11-21.
[28]HARTIGAN J A, WONG M A. A K-means clustering algorithm[J]. Journal of the Royal Statistical Society, Series C: Applied Statistics, 1979, 28(1): 100-108.
[29]PELLEG D, MOORE A W. X-means: extending K-means with efficient estimation of the number of clusters[C]//Proceedings of the Seventeenth International Conference on Machine Learning. Stanford, USA, 2000: 727-734.
[30]MACQUEEN J B. Some methods for classification and analysis of multivariate observations[C]//Proceedings of 5th Berkeley Symposium on Mathematical Statistics and Probability. Berkeley: University of California Press, 1967: 281-297.

Memo

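Received: 2011-11-24.
Funding: supported by the National Natural Science Foundation of China (61703081).
Corresponding author: XIAO Rong.
E-mail: xrsmile@gmail.com.
About the authors:
XIAO Rong, female, born in 1989, master's student. Her main research interests include data mining and information retrieval.
KONG Liang, male, born in 1985, master's student. His main research interests include data mining and information retrieval.
ZHANG Yan, male, born in 1970, associate professor, Ph.D. His main research interests include web information processing, intelligent search technology, text analysis and data mining, and database performance. He has published a number of academic papers.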

Last Update: 2012-09-26