Towards Unconstrained Pointing Problem of Visual Question Answering: A Retrieval-based Method
Cheng, Wenlong [1,2]; Huang, Yan [1,2]; Wang, Liang [1,2]
2018-08
Conference date: 2018-08
Conference venue: Beijing International Convention Center
English abstract

The pointing problem of visual question answering (VQA) is as follows: given an image and a question that asks for the location of an object of interest, find the image region that answers the question. It is an important research problem in VQA and has many potential applications in daily life. Most existing work on this task can solve it only in multiple-choice form, i.e., candidate answers are given in advance and the model selects the correct one. In this paper, we propose a retrieval model that not only handles the multiple-choice task but also provides a feasible solution for the no-candidate-answer task. The principle of our method is to pull the question and the correct answer close together, and to push the question and incorrect answers apart, in a common feature space. To the best of our knowledge, we are the first to use a retrieval method to solve the unconstrained (no-candidate-answer) pointing problem of VQA. Furthermore, our proposed method outperforms state-of-the-art methods on the pointing task of the Visual7W dataset.
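
The pull-close, push-away principle described in the abstract can be illustrated with a margin-based ranking loss over embeddings in a shared space. The abstract does not state which loss, similarity measure, or encoders the authors use, so the sketch below is an assumption throughout: cosine similarity, a margin of 0.2, and the function names `cosine_similarity`, `triplet_ranking_loss`, and `retrieve_best_region` are all hypothetical choices made for illustration, and the vectors stand in for already-computed question and region embeddings.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def triplet_ranking_loss(question, correct, incorrect, margin=0.2):
    """Hinge loss that is zero once the correct answer is more similar
    to the question than the incorrect one by at least `margin`.
    The margin value is an illustrative assumption, not from the paper."""
    pos = cosine_similarity(question, correct)
    neg = cosine_similarity(question, incorrect)
    return max(0.0, margin - pos + neg)


def retrieve_best_region(question, regions):
    """Return the index of the region embedding closest to the question.
    With given candidates this is multiple-choice selection; with regions
    proposed from the image it is the no-candidate-answer setting."""
    scores = [cosine_similarity(question, r) for r in regions]
    return max(range(len(regions)), key=scores.__getitem__)


# Toy embeddings (made up for illustration only).
question = [1.0, 0.0]
correct = [0.9, 0.1]
incorrect = [0.0, 1.0]

loss_good = triplet_ranking_loss(question, correct, incorrect)
loss_bad = triplet_ranking_loss(question, incorrect, correct)
best = retrieve_best_region(question, [incorrect, correct])
```

In this toy case the loss vanishes when the correct region is already ranked above the incorrect one by the margin, and is positive when the roles are swapped. At inference time the same similarity score ranks regions, which is why a retrieval formulation of this kind does not depend on a fixed candidate list.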

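Language: English
Content type: Conference paper
Source URL: http://ir.ia.ac.cn/handle/173211/48533
Collection: Institute of Automation_Center for Research on Intelligent Perception and Computing
Corresponding author: Wang, Liang
Author affiliations: 1. University of Chinese Academy of Sciences
2. Center for Research on Intelligent Perception and Computing, Institute of Automation, Chinese Academy of Sciences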
Recommended citation
GB/T 7714
Cheng, Wenlong,Huang, Yan,Wang, Liang. Towards Unconstrained Pointing Problem of Visual Question Answering: A Retrieval-based Method[C]. In:. Beijing International Convention Center. 2018-08.