[1] XIONG Tieniu, QIU Jifang, HU Jian. Collection and sorting method of ancient Yi character images based on deep learning technology[J]. CAAI Transactions on Intelligent Systems, 2025, 20(4): 928-935. [doi:10.11992/tis.202406036]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume:
20
Issue:
2025, No. 4
Page number:
928-935
Column:
Academic Papers: Machine Learning
Publication date:
2025-08-05
- Title:
-
Collection and sorting method of ancient Yi character images based on deep learning technology
- Author(s):
-
XIONG Tieniu(1,2); QIU Jifang(3); HU Jian(1,2)
-
1. The Key Laboratory for Computer Systems of State Ethnic Affairs Commission, Southwest Minzu University, Chengdu 610225, China;
2. College of Computer Science and Artificial Intelligence, Southwest Minzu University, Chengdu 610225, China;
3. School of Chinese Language and Literature, Southwest Minzu University, Chengdu 610225, China
-
- Keywords:
-
deep learning; ancient Yi characters; ancient literatures; image processing; similarity matching; feature extraction; object detection; digitalization
- CLC:
-
TP391.4; TP391.1
- DOI:
-
10.11992/tis.202406036
- Abstract:
-
The ancient Yi script is one of the important carriers of Chinese culture. However, manually collecting and organizing large quantities of ancient Yi script is time-consuming and labor-intensive, and the task is made harder because very few people can read the script and their number is dwindling. In response, this paper proposes an approach to collecting and organizing images of ancient Yi script based on deep learning. For image collection, an object detection model locates each ancient Yi character in images of ancient Yi manuscripts, and the characters are extracted from those images accordingly. For image organization, because modern standardized Yi characters are derived from ancient Yi characters, standardized Yi character font files are used to automatically generate Yi character images and construct a dataset. This dataset is then used to train a feature extraction algorithm for ancient Yi script images, which addresses the current lack of an ancient Yi script image dataset, a gap caused by the large number of characters, their many variants, and incomplete cataloging. The features of newly collected ancient Yi script images are then matched against those of already cataloged images to determine whether each collected image has been previously recorded, so that uncataloged images can be organized. Experiments with several typical feature extraction algorithms and similarity computation methods validate the effectiveness of the approach.
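The matching step described in the abstract can be illustrated with a minimal sketch. It assumes that each character image has already been reduced to a feature vector by some extractor, and it uses cosine similarity as one possible similarity measure; the abstract does not state which measures the paper evaluates. The function names, the catalog layout, and the 0.9 threshold are all hypothetical choices made for illustration, not the authors' implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


def match_to_catalog(query, catalog, threshold=0.9):
    """Find the cataloged character most similar to a collected image.

    `catalog` maps a label to a feature vector (hypothetical layout).
    Returns (label, score) when the best score reaches `threshold`
    (an illustrative value), otherwise (None, score), meaning the
    collected image is treated as not yet cataloged.
    """
    best_label, best_score = None, -1.0
    for label, features in catalog.items():
        score = cosine_similarity(query, features)
        if score > best_score:
            best_label, best_score = label, score
    if best_score >= threshold:
        return best_label, best_score
    return None, best_score


# Toy 3-dimensional features; real extractors produce far longer vectors.
catalog = {"char_a": [1.0, 0.0, 0.0], "char_b": [0.0, 1.0, 0.0]}
known = match_to_catalog([0.9, 0.1, 0.0], catalog)    # close to char_a
unknown = match_to_catalog([1.0, 1.0, 0.0], catalog)  # equally far from both
```

In this sketch, an image whose best score falls below the threshold is flagged as uncataloged and would be passed on for organization. In practice the threshold would need to be tuned on labeled data.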