Deep cross-modal retrieval for remote sensing image and audio
Mao, Gou (1,2); Yuan, Yuan (1); Xiaoqiang, Lu (1)
2018-10-08
Conference date: 2018-08-19
Conference location: Beijing, China
DOI: 10.1109/PRRS.2018.8486338
Abstract (English)

Remote sensing image retrieval has many important applications in civilian and military fields, such as disaster monitoring and target detection. However, existing research on image retrieval, which mainly follows two directions, text-based and content-based, cannot meet the need for rapid and convenient retrieval in some special applications and emergency scenarios. Text-based retrieval is limited by keyboard input, which is inefficient in urgent situations, and content-based retrieval needs an example image as a reference, which usually does not exist. Speech, as a direct, natural and efficient means of human-machine interaction, can make up for these shortcomings. Hence, a novel cross-modal retrieval method for remote sensing images and spoken audio is proposed in this paper. We first build a large-scale remote sensing image dataset with a large number of manually annotated spoken audio captions for the cross-modal retrieval task. Then a Deep Visual-Audio Network is designed to directly learn the correspondence between image and audio. This model integrates feature extraction and multi-modal learning into the same network. Experiments on the proposed dataset verify the effectiveness of our approach and show that it is feasible for speech-to-image retrieval. © 2018 IEEE.

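Ownership ranking: 1
Proceedings: 2018 10th IAPR Workshop on Pattern Recognition in Remote Sensing, PRRS 2018
Proceedings publisher: Institute of Electrical and Electronics Engineers Inc.
Language: English
ISBN: 9781538684795
Content type: Conference paper
Source URL: http://ir.opt.ac.cn/handle/181661/30867
Collection: Xi'an Institute of Optics and Precision Mechanics_Center for OPTical IMagery Analysis and Learning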
Author affiliations:
1. Chinese Academy of Sciences, Center for OPTical IMagery Analysis and Learning (OPTIMAL), Xi'an Institute of Optics and Precision Mechanics, Xi'an, Shaanxi; 710119, China;
2. University of Chinese Academy of Sciences, Beijing; 100049, China
Recommended citation
GB/T 7714
Mao, Gou,Yuan, Yuan,Xiaoqiang, Lu. Deep cross-modal retrieval for remote sensing image and audio[C]. In:. Beijing, China. 2018-08-19.