[1] LIU Zhuokun, LIU Huaping, HUANG Wenmei, et al. Audiovisual cross-modal retrieval for surface material[J]. CAAI Transactions on Intelligent Systems, 2019, 14(3): 423-429. [doi:10.11992/tis.201804030]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume: 14
Issue: 2019(3)
Pages: 423-429
Column: Academic Papers - Intelligent Systems
Publication date: 2019-05-05
- Title: Audiovisual cross-modal retrieval for surface material
- Author(s): LIU Zhuokun1; LIU Huaping2; HUANG Wenmei1; WANG Bowen1; SUN Fuchun2
- Affiliation(s):
1. State Key Laboratory of Reliability and Intelligence of Electrical Equipment, Hebei University of Technology, Tianjin 300130, China;
2. State Key Lab of Intelligent Technology and Systems, Tsinghua University, Beijing 100084, China
- Keywords: cross-modal retrieval; feature extraction; canonical correlation analysis; subspace mapping; material analysis; convolutional neural network; Mel-frequency cepstral coefficients; Euclidean distance
- CLC: TP391
- DOI: 10.11992/tis.201804030
- Abstract: Text and image features alone sometimes cannot truly and accurately characterize a surface material. To address this problem, a cross-modal method for surface material retrieval between the audio and visual modalities is proposed. First, sound features are extracted as Mel-frequency cepstral coefficients (MFCCs), and image features are extracted with a convolutional neural network (CNN). These two sets of features are then mapped into a common subspace using canonical correlation analysis, where retrieval is performed by Euclidean distance. Experimental validation on the tactile texture dataset of the Technical University of Munich shows that the proposed method performs well in material retrieval.