[1]XIAO Rong,KONG Liang,ZHANG Yan.A text clustering model for diverse versions discovery[J].CAAI Transactions on Intelligent Systems,2012,7(4):307-314.
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP] Volume:
7
Issue:
2012, No. 4
Page number:
307-314
Column:
Academic Papers: Natural Language Processing and Understanding
Publication date:
2012-08-25
- Title:
-
A text clustering model for diverse versions discovery
- Author(s):
-
XIAO Rong; KONG Liang; ZHANG Yan
-
Key Laboratory on Machine Perception of MOE, Peking University, Beijing 100871, China
-
- Keywords:
-
diverse versions discovery; highly differentiated words; clustering model; topic analysis
- CLC:
-
TP18
- DOI:
-
-
- Abstract:
-
The development of information technology brings a large volume of news and events into daily life. Although previous research has provided various algorithms for detecting and tracking events, little of it focuses on uncovering the diverse versions of an event. This paper proposes CDW, a novel algorithm that discovers different versions of one event from news reports. First, documents are mapped to the topic layer to obtain information about each topic. Then the highly differentiated words of each topic are extracted and used to cluster the documents. Finally, the various versions of the event are obtained. Experiments on two data sets show that the algorithm is effective and outperforms several related algorithms, including classical methods such as K-means and latent Dirichlet allocation (LDA).
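The pipeline outlined in the abstract (topic layer, then per-topic distinguishing words, then clustering) can be illustrated with a minimal Python sketch. It is not the CDW algorithm itself: the abstract does not give the scoring function or the topic model, so the sketch assumes the topic-word distributions are already available and ranks words by a simple difference of probabilities. The names `differentiated_words`, `assign_versions`, `topic_word`, and the toy data are all hypothetical.

```python
from collections import Counter


def differentiated_words(topic_word, top_n=2):
    """Pick, for each topic, the words that separate it most from the others.

    Assumed score (not taken from the paper): P(w | topic) minus the mean
    of P(w | other topics).
    """
    topics = sorted(topic_word)
    if len(topics) < 2:
        raise ValueError("need at least two topics to compare")
    vocab = {w for dist in topic_word.values() for w in dist}
    result = {}
    for t in topics:
        others = [o for o in topics if o != t]

        def score(w):
            own = topic_word[t].get(w, 0.0)
            rest = sum(topic_word[o].get(w, 0.0) for o in others) / len(others)
            return own - rest

        # Highest score first; ties broken alphabetically for determinism.
        result[t] = sorted(vocab, key=lambda w: (-score(w), w))[:top_n]
    return result


def assign_versions(docs, key_words):
    """Label each document with the topic whose key words it mentions most."""
    labels = []
    for doc in docs:
        counts = Counter(doc.lower().split())
        best = max(sorted(key_words),
                   key=lambda t: sum(counts[w] for w in key_words[t]))
        labels.append(best)
    return labels


# Toy topic-word distributions standing in for the output of a topic model.
topic_word = {
    "version_a": {"explosion": 0.30, "factory": 0.25, "gas": 0.25,
                  "leak": 0.15, "attack": 0.05},
    "version_b": {"explosion": 0.30, "factory": 0.25, "attack": 0.20,
                  "bomb": 0.20, "gas": 0.05},
}
docs = [
    "gas leak caused the factory explosion",
    "bomb attack suspected in factory explosion",
    "officials say a gas leak is likely",
]

key_words = differentiated_words(topic_word)
labels = assign_versions(docs, key_words)
print(key_words)
print(labels)
```

Words shared by both versions ("explosion", "factory") score zero and are ignored, so only the words that distinguish one account from the other drive the grouping. This is the intuition the abstract attributes to highly differentiated words.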