Video captioning based on vision transformer and reinforcement learning
Zhao, Hong (1); Chen, Zhiwen (1); Guo, Lan (1); Han, Zeyu (2)
Journal: PEERJ COMPUTER SCIENCE
2022-03-16
Volume: 8
Keywords: Video captioning; Vision transformer; Reinforcement learning; Long short-term memory network; Computer vision; Natural language processing; Attention mechanism; Encode-decode; Deep learning
DOI: 10.7717/peerj-cs.916
Abstract: Global encoding of visual features in video captioning is important for improving the description accuracy. In this paper, we propose a video captioning method that combines Vision Transformer (ViT) and reinforcement learning. Firstly, ResNet-152 and ResNeXt-101 are used to extract features from videos. Secondly, the encoding block of the ViT network is applied to encode the video features. Thirdly, the encoded features are fed into a Long Short-Term Memory (LSTM) network to generate a video content description. Finally, the accuracy of the video content description is further improved by fine-tuning with reinforcement learning. We conducted experiments on MSR-VTT, a benchmark dataset for video captioning. The results show that, compared with current mainstream methods, the proposed model improves by 2.9%, 1.4%, 0.9% and 4.8% on the four evaluation metrics BLEU-4, METEOR, ROUGE-L and CIDEr-D, respectively.
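The "global encoding" step in the abstract can be illustrated with a minimal sketch of the self-attention operation at the core of a ViT encoder block. This is not the authors' implementation: it assumes a single attention head, omits the learned query/key/value projections, layer normalization and feed-forward sublayer of a real encoder block, and uses made-up 4-dimensional vectors in place of ResNet-152/ResNeXt-101 features. The names `self_attention` and `frame_features` are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(frames):
    """Single-head scaled dot-product self-attention (simplified).

    Assumption: queries, keys and values are the raw frame features;
    a real ViT encoder block applies learned linear projections first.
    Returns the encoded features and the attention weight matrix.
    """
    d = len(frames[0])
    scale = math.sqrt(d)
    weights = []
    for q in frames:
        # Similarity of this frame to every frame in the clip.
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in frames]
        weights.append(softmax(scores))
    # Each encoded frame is a weighted average over all frames.
    encoded = [
        [sum(w * frame[j] for w, frame in zip(row, frames)) for j in range(d)]
        for row in weights
    ]
    return encoded, weights


# Toy stand-ins for per-frame CNN features (illustrative values only).
frame_features = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
]
encoded, weights = self_attention(frame_features)
```

Because every row of `weights` spans all frames, each encoded vector mixes information from the whole clip, which is the sense in which the encoding is global. In the pipeline the abstract describes, such encoded features would then be passed to the LSTM decoder.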
WOS research area: Computer Science
Language: English
Publisher: PEERJ INC
WOS accession number: WOS:000773302200003
Content type: Journal article
Source URL: http://ir.lut.edu.cn/handle/2XXMBERH/158092
Collection: Lanzhou University of Technology
Author affiliations:
1. Lanzhou Univ Technol, Sch Comp & Commun, Lanzhou, Gansu, Peoples R China;
2. Lanzhou Univ Technol, Network & Informat Ctr, Lanzhou, Gansu, Peoples R China
Recommended citation
GB/T 7714
Zhao, Hong, Chen, Zhiwen, Guo, Lan, et al. Video captioning based on vision transformer and reinforcement learning[J]. PEERJ COMPUTER SCIENCE, 2022, 8.
APA Zhao, Hong, Chen, Zhiwen, Guo, Lan, & Han, Zeyu. (2022). Video captioning based on vision transformer and reinforcement learning. PEERJ COMPUTER SCIENCE, 8.
MLA Zhao, Hong, et al. "Video captioning based on vision transformer and reinforcement learning". PEERJ COMPUTER SCIENCE 8 (2022).