Dual Learning for Cross-domain Image Captioning
Wei Zhao; Wei XU; Min Yang; Jianbo Ye; Zhou Zhao; Yabing Feng; Yu Qiao
2017
Conference location: Singapore
Abstract: Recent AI research has shown increasing interest in automatically generating textual descriptions of images, known as the image captioning problem. Significant progress has been made in domains where plenty of labeled training data (i.e., image-text pairs) are readily available or can be collected. However, obtaining richly annotated data is a time-consuming and expensive process, creating a substantial barrier to applying image captioning methods to a new domain. In this paper, we propose a cross-domain image captioning approach that uses a novel dual learning mechanism to overcome this barrier. First, we model the alignment between the neural representations of images and those of natural language in the source domain, where sufficient labeled data is available. Second, we adjust the pre-trained model by examining limited data (or unpaired data) in the target domain. In particular, we introduce a dual learning mechanism with a policy gradient method that generates highly rewarded captions. The mechanism simultaneously optimizes two coupled objectives: generating image descriptions in text and generating plausible images from text descriptions, with the hope that explicitly exploiting their coupled relation can safeguard the performance of image captioning in the target domain. To verify the effectiveness of our model, we use the MSCOCO dataset as the source domain and two other datasets (Oxford-102 and Flickr30k) as the target domains. The experimental results show that our model consistently outperforms previous methods for cross-domain image captioning.
Language: English
Content type: Conference paper
Source URL: [http://ir.siat.ac.cn:8080/handle/172644/11763]
Collection: Shenzhen Institutes of Advanced Technology_Institute of Advanced Integration Technology
Author affiliation: 2017
Recommended citation
GB/T 7714
Wei Zhao, Wei XU, Min Yang, et al. Dual Learning for Cross-domain Image Captioning[C]. In: . Singapore.