Audiovisual cross-modal retrieval for surface material - CAAI Transactions on Intelligent Systems

[1] LIU Zhuokun, LIU Huaping, HUANG Wenmei, et al. Audiovisual cross-modal retrieval for surface material[J]. CAAI Transactions on Intelligent Systems, 2019, 14(03): 423-429. [doi:10.11992/tis.201804030]

CAAI Transactions on Intelligent Systems [ISSN: 1673-4785 / CN: 23-1538/TP]

Volume:
14
Issue:
2019, No. 03
Pages:
423-429
Column:
Publication date:
2019-05-05

Article Info

Title:
Audiovisual cross-modal retrieval for surface material
Author(s):
LIU Zhuokun (1), LIU Huaping (2), HUANG Wenmei (1), WANG Bowen (1), SUN Fuchun (2)
1. State Key Laboratory of Reliability and Intelligence of Electrical Equipment, Hebei University of Technology, Tianjin 300130, China;
2. State Key Lab of Intelligent Technology and Systems, Tsinghua University, Beijing 100084, China
Keywords:
cross-modal retrieval; feature extraction; canonical correlation analysis; subspace mapping; material analysis; convolutional neural network; Mel-frequency cepstral coefficients; Euclidean distance
CLC number:
TP391
DOI:
10.11992/tis.201804030
Abstract:
Text and image features alone sometimes cannot support a faithful and accurate analysis of an object's material. To address this, this paper applies a cross-modal retrieval method in the audiovisual domain to surface material retrieval. First, Mel-frequency cepstral coefficient (MFCC) features are extracted from sound, and image features are extracted with a convolutional neural network (CNN). Canonical correlation analysis then maps the two feature sets into a common subspace, where retrieval is performed by Euclidean distance. Experiments on the tactile texture dataset of the Technical University of Munich implement sound-to-image cross-modal retrieval, and the results show that the proposed method performs well on material retrieval.
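The last two steps of the abstract (canonical correlation analysis followed by Euclidean-distance retrieval) can be illustrated with a minimal sketch. This is not the authors' implementation: the features are random stand-ins for the MFCC and CNN features, the dimensions (13 and 64), the regularization term `reg`, and the helper names `inv_sqrt`, `fit_cca`, and `retrieve` are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins (assumption): in the paper these would be MFCC vectors
# extracted from sound and CNN features extracted from surface images.
n_pairs, n_latent = 200, 4
latent = rng.normal(size=(n_pairs, n_latent))
audio_feats = latent @ rng.normal(size=(n_latent, 13)) + 0.1 * rng.normal(size=(n_pairs, 13))
image_feats = latent @ rng.normal(size=(n_latent, 64)) + 0.1 * rng.normal(size=(n_pairs, 64))


def inv_sqrt(cov):
    """Inverse matrix square root of a symmetric positive-definite matrix."""
    eigvals, eigvecs = np.linalg.eigh(cov)
    return eigvecs @ np.diag(1.0 / np.sqrt(eigvals)) @ eigvecs.T


def fit_cca(X, Y, n_components, reg=1e-3):
    """Regularized linear CCA; returns the means and projection matrices."""
    mean_x, mean_y = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mean_x, Y - mean_y
    n = X.shape[0]
    # Covariance estimates; the ridge term keeps them invertible.
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)
    Kx, Ky = inv_sqrt(Sxx), inv_sqrt(Syy)
    # Singular vectors of the whitened cross-covariance give the canonical directions.
    U, _, Vt = np.linalg.svd(Kx @ Sxy @ Ky)
    Wx = Kx @ U[:, :n_components]
    Wy = Ky @ Vt.T[:, :n_components]
    return mean_x, mean_y, Wx, Wy


def retrieve(query_audio, gallery_images, model, top_k=3):
    """Rank gallery images by Euclidean distance to the audio query in the subspace."""
    mean_x, mean_y, Wx, Wy = model
    q = (query_audio - mean_x) @ Wx
    G = (gallery_images - mean_y) @ Wy
    distances = np.linalg.norm(G - q, axis=1)
    return np.argsort(distances)[:top_k]


model = fit_cca(audio_feats, image_feats, n_components=4)
ranked = retrieve(audio_feats[0], image_feats, model, top_k=3)
print("Indices of the 3 nearest images for audio query 0:", ranked)
```

In a real setting the model would be fitted on training pairs and the gallery would hold features of held-out images; the sketch reuses the same synthetic pairs only to stay short.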

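Memo

Received: 2018-04-18.
Funding: Key Program of the National Natural Science Foundation of China (U1613212); Natural Science Foundation of Hebei Province (E2017202035).
Author biographies: LIU Zhuokun, male, born in 1994, master's student. His main research interests are novel magnetic materials and devices, tactile perception, and pattern recognition. LIU Huaping, male, born in 1976, associate professor and doctoral supervisor, IEEE Senior Member, council member of the Chinese Association for Artificial Intelligence (CAAI), and secretary-general of the CAAI Technical Committee on Cognitive Systems and Information Processing. His main research interests are robot perception, learning and control, and multimodal information fusion. He has led 5 National Natural Science Foundation of China projects and published more than 200 academic papers, over 100 of which are indexed by SCI. HUANG Wenmei, female, born in 1969, professor. Her main research interests are magnetic materials and devices, and electric machines and their control. She has completed 4 National Natural Science Foundation of China projects and 2 Natural Science Foundation of Hebei Province projects, and has led 1 Hebei Province High-Level Talent project. She has published more than 40 academic papers, over 20 of which are indexed by SCI, EI, or ISTP.
Corresponding author: LIU Huaping. E-mail: hpliu@tsinghua.edu.cn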