An Orthogonal Subspace Decomposition Method for Cross-Modal Retrieval

doi:10.1109/MIS.2022.3169884

Title	An Orthogonal Subspace Decomposition Method for Cross-Modal Retrieval
Authors	Zeng, Zhixiong 1,2; Xu, Nan 1,2; Mao, Wenji 1,2; Zeng, Daniel 1,2
Journal	IEEE INTELLIGENT SYSTEMS
Publication date	2022-05-01
Volume	37, Issue 3, Pages 45-53
Keywords	Semantics; Representation learning; Task analysis; Matrix decomposition; Automation; Interference; Intelligent systems; Cross-modal Retrieval; Representation Learning; Orthogonal Decomposition
ISSN	1541-1672
DOI	10.1109/MIS.2022.3169884
Corresponding author	Zeng, Zhixiong
Abstract	As a general characteristic observed in real-world datasets, multimodal data are usually only partially associated: they comprise information commonly shared across modalities (i.e., modality-shared information) and specific information that exists in only a single modality (i.e., modality-specific information). Cross-modal retrieval methods typically treat the information in multimodal data as a whole and project it into a common representation space to compute a similarity measure. In fact, only modality-shared information can be well aligned when learning common representations, whereas modality-specific information usually introduces interference and decreases the performance of cross-modal retrieval. Explicitly distinguishing and utilizing these two kinds of multimodal information is important to cross-modal retrieval, but has rarely been studied in previous research. In this article, we explicitly distinguish and utilize modality-shared and modality-specific features to learn better common representations, and propose an orthogonal subspace decomposition method for cross-modal retrieval. Specifically, we introduce a structure preservation loss to ensure that modality-shared information is well preserved, and optimize an intramodal discrimination loss and an intermodal invariance loss to learn semantically discriminative features for cross-modal retrieval. We conduct comprehensive experiments on four widely used benchmark datasets, and the experimental results demonstrate the effectiveness of our proposed method.
Funding projects	Ministry of Science and Technology of China [2020AAA0108405]; National Natural Science Foundation of China [71621002]; National Natural Science Foundation of China [11832001]
WOS research areas	Computer Science; Engineering
Language	English
Publisher	IEEE COMPUTER SOC
WOS accession number	WOS:000831149400014
Funding organizations	Ministry of Science and Technology of China; National Natural Science Foundation of China
Content type	Journal article
Source URL	http://ir.ia.ac.cn/handle/173211/49806
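Collection	Institute of Automation_State Key Laboratory of Management and Control for Complex Systems_Research Center for Internet Big Data and Security Informatics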
Author affiliations	1. Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100190, Peoples R China; 2. Chinese Acad Sci, Inst Automat, Beijing 100190, Peoples R China
Recommended citation (GB/T 7714)	Zeng, Zhixiong,Xu, Nan,Mao, Wenji,et al. An Orthogonal Subspace Decomposition Method for Cross-Modal Retrieval[J]. IEEE INTELLIGENT SYSTEMS,2022,37(3):45-53.
APA	Zeng, Zhixiong,Xu, Nan,Mao, Wenji,&Zeng, Daniel.(2022).An Orthogonal Subspace Decomposition Method for Cross-Modal Retrieval.IEEE INTELLIGENT SYSTEMS,37(3),45-53.
MLA	Zeng, Zhixiong,et al."An Orthogonal Subspace Decomposition Method for Cross-Modal Retrieval".IEEE INTELLIGENT SYSTEMS 37.3(2022):45-53.