[1] GU Linazi, SUN Tieli, YI Liyaer, et al. An approach to the text categorization of the Kazakh language based on an active learning support vector machine[J]. CAAI Transactions on Intelligent Systems, 2011, 6(3): 261-267.
CAAI Transactions on Intelligent Systems[ISSN 1673-4785/CN 23-1538/TP] Volume:
6
Issue:
2011, No. 3
Page number:
261-267
Column:
Academic Papers: Natural Language Processing and Understanding
Publication date:
2011-06-25
- Title:
-
An approach to the text categorization of the Kazakh language based on an active learning support vector machine
- Author(s):
-
GU Linazi 1,2; SUN Tieli 2; YI Liyaer 1; WU Di 2
-
1.School of Electronic and Information Engineering, Yili Normal University, Yining 835000, China;
2.School of Computer Science and Information Technology, Northeast Normal University, Changchun 130117, China
-
- Keywords:
-
support vector machine; Kazakh text categorization; active learning
- CLC:
-
TP391.1
- DOI:
-
-
- Abstract:
-
In applying text categorization theory to the study of the Kazakh language, an approach to Kazakh text categorization based on a support vector machine was introduced. The method for extracting word stems was analyzed from the perspective of Kazakh linguistics. Based on an analysis of the support vector machine, the proposed active learning algorithm was adopted for training, and the trained classifier was used to classify new text. The experimental results show that this approach achieves acceptable classification performance on Kazakh text.
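
The abstract does not say how the active learning algorithm selects examples, so the following is only a minimal sketch of one common strategy, margin-based uncertainty sampling with a linear support vector machine, and should not be read as the authors' method. It assumes synthetic two-dimensional feature vectors in place of stem-based Kazakh text features, a bias-free classifier trained by plain stochastic gradient descent, and an oracle that returns the true label on request. All function names and parameters (`train_linear_svm`, `make_pool`, `active_learn`, `accuracy`) are hypothetical.

```python
import random


def dot(w, x):
    """Inner product of two equal-length vectors."""
    return sum(wj * xj for wj, xj in zip(w, x))


def train_linear_svm(X, y, epochs=50, lr=0.1, lam=0.01):
    """SGD on the L2-regularised hinge loss; labels in {-1, +1}, no bias term."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            violated = yi * dot(w, xi) < 1.0  # margin check before the update
            w = [(1.0 - lr * lam) * wj for wj in w]  # regularisation shrinkage
            if violated:
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
    return w


def make_pool(n_per_class=20, seed=42):
    """Synthetic stand-in for document vectors: two separated clusters."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (+1, -1):
        for _ in range(n_per_class):
            X.append([2.0 * label + rng.uniform(-1, 1),
                      2.0 * label + rng.uniform(-1, 1)])
            y.append(label)
    return X, y


def active_learn(X, oracle, seed_idx, n_queries=5):
    """Repeatedly label the unlabeled point closest to the decision boundary."""
    labeled = list(seed_idx)
    for _ in range(n_queries):
        w = train_linear_svm([X[i] for i in labeled],
                             [oracle[i] for i in labeled])
        unlabeled = [i for i in range(len(X)) if i not in labeled]
        # Smallest |w . x| means the classifier is least certain about x.
        query = min(unlabeled, key=lambda i: abs(dot(w, X[i])))
        labeled.append(query)  # the oracle supplies this label next round
    w = train_linear_svm([X[i] for i in labeled],
                         [oracle[i] for i in labeled])
    return w, labeled


def accuracy(w, X, y):
    """Fraction of points whose predicted sign matches the true label."""
    hits = sum(1 for xi, yi in zip(X, y)
               if (1 if dot(w, xi) >= 0 else -1) == yi)
    return hits / len(X)


X, y = make_pool()
w, labeled = active_learn(X, y, seed_idx=[0, 20], n_queries=5)
print(len(labeled), accuracy(w, X, y))
```

The loop starts from one labeled example per class and requests only five further labels. In a real text categorization setting the pool would hold feature vectors built from extracted word stems, and the oracle would be a human annotator.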