[1]YU Bo,WANG Zhihai,SUN Yadong,et al.Unstructured document sensitive data identification and abnormal behavior analysis[J].CAAI Transactions on Intelligent Systems,2021,16(5):932-939.[doi:10.11992/tis.202104028]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume:
16
Issue:
2021, No. 5
Page number:
932-939
Column:
Wu Wenjun Artificial Intelligence Science and Technology Progress Award, First Prize
Publication date:
2021-09-05
- Title: Unstructured document sensitive data identification and abnormal behavior analysis
- Author(s): YU Bo; WANG Zhihai; SUN Yadong; XIE Fujin; AN Peng
  Beijing Wondersoft Technology Co., Ltd, Beijing 100876, China
- Keywords: data security; artificial intelligence; classification; language model; user behavior analysis; sample; NLP; supervised learning
- CLC: TP18; TP319; TP309
- DOI: 10.11992/tis.202104028
- Abstract:
Classifying data quickly and accurately within massive data sets, and quickly identifying abnormal user behavior, are important research topics in data security. In data classification, natural language processing improves classification accuracy, but three problems urgently need to be solved: mixed-language Chinese text, the low accuracy of unsupervised learning, and the heavy sample-labeling workload of supervised learning. In user anomaly analysis, heavy dependence on samples leads to low accuracy in recognizing abnormal behavior; this paper therefore proposes using outlier detection to build an abnormal behavior sample library and so reduce that dependence. To verify the feasibility of the method, an experimental system was built and evaluated, and the authors report that the proposed method significantly improves the accuracy of both data classification and abnormal behavior analysis.
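The abstract does not say which outlier detector the authors use, so the following is only a minimal sketch of the general idea: flag statistically unusual activity and use the flags as seed labels for an abnormal behavior sample library. The median-absolute-deviation rule, the 3.5 threshold, the function names, and the per-user access counts are all illustrative assumptions, not details from the paper.

```python
from statistics import median


def mad_outliers(values, threshold=3.5):
    """Return indices whose modified z-score exceeds `threshold`.

    Uses the median absolute deviation (MAD), a common robust outlier
    rule. This is an assumed stand-in, not the paper's detector.
    """
    med = median(values)
    abs_dev = [abs(v - med) for v in values]
    mad = median(abs_dev)
    if mad == 0:
        return []  # no spread, so this rule cannot flag anything
    return [i for i, d in enumerate(abs_dev) if 0.6745 * d / mad > threshold]


def build_sample_library(events, threshold=3.5):
    """Label each (user, count) record as 'abnormal' or 'normal'."""
    counts = [count for _, count in events]
    flagged = set(mad_outliers(counts, threshold))
    return [
        {"user": user, "count": count,
         "label": "abnormal" if i in flagged else "normal"}
        for i, (user, count) in enumerate(events)
    ]


# Hypothetical data: sensitive-document accesses per user in one day.
events = [("u1", 12), ("u2", 9), ("u3", 11),
          ("u4", 10), ("u5", 240), ("u6", 13)]
library = build_sample_library(events)
abnormal_users = [row["user"] for row in library if row["label"] == "abnormal"]
print(abnormal_users)
```

Records labeled this way could then serve as initial training samples for a supervised model, which is one way to reduce the manual labeling burden the abstract describes.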