[1] KANG Jie, LIU Wei. A crossmodal retrieval method for intelligent matching of decoration cases[J]. CAAI Transactions on Intelligent Systems, 2022, 17(4): 714-720. [doi:10.11992/tis.202106012]

A crossmodal retrieval method for intelligent matching of decoration cases

CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP] Volume: 17 Issue: 4 (2022) Pages: 714-720 Column: Academic Papers (Natural Language Processing and Understanding) Publication date: 2022-07-05

Title:: A crossmodal retrieval method for intelligent matching of decoration cases

Author(s):: KANG Jie; LIU Wei; School of Electrical and Control Engineering, Shaanxi University of Science and Technology, Xi’an 710021, China

Keywords:: text information; style; decoration cases; the customer service system for home decoration; intelligent matching; crossmodal retrieval; style aggregation; dual loss function

CLC:: TP391

DOI:: 10.11992/tis.202106012

Abstract:: An important function of a customer service system for home decoration is to provide users, in real time, with decoration cases whose style matches the text they enter. At present this function relies mainly on manual work, which cannot meet users' demand for quick and timely consultation and also raises enterprises' labor costs. To address this, the paper proposes a crossmodal retrieval method for intelligent matching of decoration cases. Because existing algorithms cannot directly establish a correspondence between texts and decoration cases, a style aggregation module is designed to obtain a unified style feature for a set of decoration cases. This makes it easier for the subsequent network to establish the latent semantic relationship between texts and decoration cases and to match them across modalities. In addition, a dual loss function is constructed to train the model, targeting the problem of classifying hard and easy samples in the image modality. Experimental results show that the proposed method achieves better retrieval results on the multimodal dataset of decoration cases.

