[1]HUANG Qingming,WANG Shuhui,XU Qianqian,et al.Image video centered cross-media analysis and reasoning[J].CAAI Transactions on Intelligent Systems,2021,16(5):835-848.[doi:10.11992/tis.202105042]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume:
16
Issue:
2021, No. 5
Page number:
835-848
Column:
Wu Wenjun Artificial Intelligence Natural Science Award, First Prize
Publication date:
2021-09-05
- Title:
-
Image video centered cross-media analysis and reasoning
- Author(s):
-
HUANG Qingming1,2; WANG Shuhui2; XU Qianqian2; LI Liang2; JIANG Shuqiang2
-
1. School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100049, China;
2. Key Lab of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
-
- Keywords:
-
cross-media; image video; unified representation; correlative understanding; explainable reasoning; human-computer collaboration; knowledge graph; content management and service
- CLC:
-
TP37
- DOI:
-
10.11992/tis.202105042
- Abstract:
-
Bridging the heterogeneity gap and the semantic gap between cross-media content and cross-media knowledge, and effectively managing and utilizing the huge amount of cross-media data, are urgent bottleneck problems in developing a new generation of artificial intelligence. Targeting massive online cross-media content, represented by images and videos, and drawing on human perception and cognition mechanisms, this paper studies key technologies including unified representation and symbolic representation of cross-media content, deep correlative understanding of cross-media content, and human-like cross-media intelligent reasoning. Building on these technologies, the paper addresses the common problem of knowledge shortage in the development of a new generation of artificial intelligence by investigating the construction of a large-scale cross-media knowledge graph and labeling technology based on human-machine cooperation. The aim is to provide strong support for the advance from cross-media perception to cognition, and to offer feasible solutions for cross-media content management and popular service applications such as cross-media content understanding, retrieval, and content transformation and generation.