KM4: Visual reasoning via Knowledge Embedding Memory Model with Mutual Modulation | |
Zheng, Wenbo1,2; Yan, Lan2,4; Gou, Chao3; Wang, Fei-Yue2 | |
Journal | INFORMATION FUSION |
2021-03-01 | |
Volume | 67, Pages: 14-28 |
Keywords | Visual reasoning ; Knowledge-based representation learning ; Memory network ; Knowledge embedding |
ISSN | 1566-2535 |
DOI | 10.1016/j.inffus.2020.10.007 |
Corresponding author | Wang, Fei-Yue(feiyue.wang@ia.ac.cn) |
Abstract | Visual reasoning is a special kind of visual question answering that is essentially multi-step and compositional and requires intensive text-visual interaction. The most important and challenging problem in visual reasoning is designing an effective and robust model. Two challenges must be overcome. First, textual and visual information must be considered jointly to make accurate inferences. Second, existing deep learning-based approaches are often too specific to a particular task. To address these issues, we propose a knowledge memory embedding model with mutual modulation for visual reasoning. This approach not only learns knowledge-based embeddings derived from a key-value memory network to make full, joint use of textual and visual information, but also exploits prior knowledge through knowledge-based representation learning to improve performance and to extend to other general reasoning tasks. Experimental results on four benchmarks show that the proposed approach significantly improves performance compared with other state-of-the-art methods and confirm the robustness of our model. Most importantly, we apply our model to four reasoning tasks and show experimentally that it effectively supports relational reasoning and improves performance across several tasks and datasets. |
Funding projects | National Natural Science Foundation of China[61806198] ; National Natural Science Foundation of China[61533019] ; National Natural Science Foundation of China[U1811463] ; Key Research and Development Program of Guangzhou[202007050002] ; National Key Research and Development Program of China[2018AAA0101502] |
WOS keywords | FUSION ; MEMORY |
WOS research area | Computer Science |
Language | English |
Publisher | ELSEVIER |
WOS accession number | WOS:000598348400003 |
Funding organizations | National Natural Science Foundation of China ; Key Research and Development Program of Guangzhou ; National Key Research and Development Program of China |
Content type | Journal article |
Source URL | [http://ir.ia.ac.cn/handle/173211/42538] |
Collection | Institute of Automation_State Key Laboratory of Management and Control for Complex Systems_Advanced Control and Automation Team |
Corresponding author | Wang, Fei-Yue |
Author affiliations | 1. Xi An Jiao Tong Univ, Sch Software Engn, Xian 710049, Peoples R China ; 2. Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China ; 3. Sun Yat Sen Univ, Sch Intelligent Syst Engn, Guangzhou 510275, Peoples R China ; 4. Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100190, Peoples R China |
Recommended citation (GB/T 7714) | Zheng, Wenbo,Yan, Lan,Gou, Chao,et al. KM4: Visual reasoning via Knowledge Embedding Memory Model with Mutual Modulation[J]. INFORMATION FUSION,2021,67:14-28. |
APA | Zheng, Wenbo,Yan, Lan,Gou, Chao,&Wang, Fei-Yue.(2021).KM4: Visual reasoning via Knowledge Embedding Memory Model with Mutual Modulation.INFORMATION FUSION,67,14-28. |
MLA | Zheng, Wenbo,et al."KM4: Visual reasoning via Knowledge Embedding Memory Model with Mutual Modulation".INFORMATION FUSION 67(2021):14-28. |