Zero-Shot Predicate Prediction for Scene Graph Parsing
Li, Yiming2; Yang, Xiaoshan1,4,5; Huang, Xuhui3; Ma, Zhe3; Xu, Changsheng1,4,5
Journal: IEEE TRANSACTIONS ON MULTIMEDIA
Year: 2023
Volume: 25, Pages: 3140-3153
Keywords: Deep learning; zero-shot; scene graph
ISSN: 1520-9210
DOI: 10.1109/TMM.2022.3155928
Corresponding author: Xu, Changsheng (csxu@nlpr.ia.ac.cn)
Abstract: The scene graph is a structured semantic representation of an image that represents objects and relationships as vertices and edges, respectively. Since it is impossible to manually label all potential relationships in the real world, some previous methods apply zero-shot learning to scene graph generation. However, existing methods take the triplet (i.e., (subject-predicate-object)) as the basic unit of a relationship, and each element (i.e., subject, predicate, or object) of an unseen relationship has actually been seen in the training data; they therefore ignore unseen predicates. To predict the unseen predicate, we introduce a novel task named zero-shot predicate prediction, which is crucial for extending existing scene graph generation methods to recognize more relationship classes. The new task is challenging and cannot be resolved by conventional zero-shot learning methods because each predicate exhibits large intra-class variation. First, the large intra-class variation makes it difficult to compute a discriminative instance-level feature for a predicate class. Second, it also complicates transferring knowledge from seen classes to unseen classes. For the first challenge, we propose distilling lexical knowledge of different objects and constructing multi-modal representations of pairwise objects to reduce the intra-class variation of the predicate. For the second challenge, we build a compact semantic space in which the representations of unseen classes are reconstructed from the seen classes for zero-shot predicate classification. We evaluate the proposed method on the public Visual Genome dataset. Extensive experimental results under zero-shot, few-shot, and supervised settings demonstrate the effectiveness of the proposed method.
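The second remedy in the abstract, reconstructing unseen-class representations from seen classes in a shared semantic space, is a standard zero-shot-learning device. A minimal sketch of that idea (not the authors' implementation; the embeddings below are hypothetical toy data) reconstructs each unseen predicate's word embedding as a least-squares combination of seen predicate embeddings, then classifies an instance feature against the reconstructed prototypes:

```python
import numpy as np

# Hypothetical toy embeddings: rows are predicate word vectors.
rng = np.random.default_rng(0)
seen = rng.normal(size=(5, 8))    # 5 seen predicate classes, 8-dim embeddings
unseen = rng.normal(size=(2, 8))  # 2 unseen predicate classes

# Reconstruct each unseen embedding as a linear combination of the
# seen embeddings: solve seen.T @ W ≈ unseen.T in the least-squares sense.
W, *_ = np.linalg.lstsq(seen.T, unseen.T, rcond=None)
prototypes = (seen.T @ W).T  # reconstructed unseen-class prototypes, shape (2, 8)

# Classify a (noisy) instance feature by cosine similarity
# against the reconstructed unseen prototypes.
x = unseen[1] + 0.05 * rng.normal(size=8)
sims = prototypes @ x / (np.linalg.norm(prototypes, axis=1) * np.linalg.norm(x))
pred = int(np.argmax(sims))
```

In this sketch the classifier for an unseen predicate is obtained without any training instances of that predicate: only its word embedding and the seen classes' embeddings are used.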
Funding: National Key Research and Development Program of China [2018AAA0100604]; National Natural Science Foundation of China [61720106006, 62036012, 61721004, 62072455, U1836220, U1705262, 61872424]; Key Research Program of Frontier Sciences of CAS [QYZDJ-SSW-JSC039]; Beijing Natural Science Foundation [L201001]
WOS research areas: Computer Science; Telecommunications
Language: English
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
WOS accession number: WOS:001045742200015
Document type: Journal article
Source URL: [http://ir.ia.ac.cn/handle/173211/54026]
Research unit: State Key Laboratory of Multimodal Artificial Intelligence Systems
Author affiliations:
1. Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China
2. Zhengzhou Univ, Sch Informat Engn, Zhengzhou 450001, Peoples R China
3. CASIC, Acad 2, Lab 10, Beijing 100854, Peoples R China
4. Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing 100190, Peoples R China
5. PengCheng Lab, Shenzhen 518066, Peoples R China
Recommended citation:
GB/T 7714: Li, Yiming, Yang, Xiaoshan, Huang, Xuhui, et al. Zero-Shot Predicate Prediction for Scene Graph Parsing[J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25: 3140-3153.
APA: Li, Yiming, Yang, Xiaoshan, Huang, Xuhui, Ma, Zhe, & Xu, Changsheng. (2023). Zero-Shot Predicate Prediction for Scene Graph Parsing. IEEE TRANSACTIONS ON MULTIMEDIA, 25, 3140-3153.
MLA: Li, Yiming, et al. "Zero-Shot Predicate Prediction for Scene Graph Parsing". IEEE TRANSACTIONS ON MULTIMEDIA 25 (2023): 3140-3153.