Towards Unconstrained Pointing Problem of Visual Question Answering: A Retrieval-based Method | |
Cheng, Wenlong1,2; Huang, Yan1,2; Wang, Liang1,2 | |
2018-08 | |
Conference date | 2018-08 |
Conference venue | Beijing International Convention Center |
Abstract | The pointing problem of visual question answering (VQA) is as follows: given an image and a question that asks for the location of an object of interest, find the region that answers the question. It is an important research problem in VQA and has many potential applications in daily life. Most existing work can only solve this task in multiple-choice form, i.e., candidate answers are given in advance and the correct one is selected. In this paper, we propose a retrieval model that can not only handle the multiple-choice task but also provide a feasible solution for the no-candidate-answer task. The principle of our method is to pull the question and the correct answer close together, and push the question and incorrect answers apart, in a common feature space. To the best of our knowledge, we are the first to use a retrieval method to solve the unconstrained (no-candidate-answer) pointing problem of VQA. Furthermore, our proposed method outperforms the state-of-the-art methods on the Visual7W dataset on the pointing problem of VQA. |
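Language | English |
Content type | Conference paper |
Source URL | [http://ir.ia.ac.cn/handle/173211/48533] |
Collection | Institute of Automation_Center for Research on Intelligent Perception and Computing |
Corresponding author | Wang, Liang |
Author affiliations | 1. University of Chinese Academy of Sciences; 2. Center for Research on Intelligent Perception and Computing, Institute of Automation, Chinese Academy of Sciences |
Recommended citation (GB/T 7714) | Cheng, Wenlong; Huang, Yan; Wang, Liang. Towards Unconstrained Pointing Problem of Visual Question Answering: A Retrieval-based Method[C]. In: . Beijing International Convention Center. 2018-08. |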
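The abstract's principle (pull the question toward the correct answer and push it away from incorrect answers in a common feature space) can be illustrated with a minimal sketch. This is not the authors' model: the record does not describe their encoders or loss, so the cosine similarity, the hinge margin of 0.2, the toy 2-D embeddings, and the function names `cosine`, `ranking_loss`, and `retrieve` are all assumptions made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def ranking_loss(question, correct, incorrect, margin=0.2):
    """Hinge ranking loss (assumed form): each incorrect region contributes
    a penalty unless the correct region scores higher by at least `margin`."""
    pos = cosine(question, correct)
    return sum(max(0.0, margin - pos + cosine(question, neg)) for neg in incorrect)

def retrieve(question, regions):
    """Return the index of the region embedding most similar to the question."""
    return max(range(len(regions)), key=lambda i: cosine(question, regions[i]))

# Toy embeddings (hypothetical): one question and three candidate regions.
question = [1.0, 0.0]
regions = [[0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]

best = retrieve(question, regions)
loss = ranking_loss(question, regions[best], [regions[0], regions[2]])
```

In this sketch the same similarity score serves both settings mentioned in the abstract: with candidate answers, `retrieve` ranks the given candidates; without them, it could rank any set of region proposals, which is what makes a retrieval formulation applicable to the no-candidate-answer case.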