Learning Deep Decentralized Policy Network by Collective Rewards for Real-Time Combat Game
Peixi Peng1; Junliang Xing1; Lili Cao1; Lisen Mu2; Chang Huang2
Year: 2019
Conference Date: August 10-16, 2019
Conference Venue: Macao, China
Keywords: Multi-agent Learning; Deep Decentralized Policy Network; Real-time Combat Game
Abstract

The task of a real-time combat game is to coordinate multiple units to defeat the enemies controlled by a given opponent in a real-time combat scenario. It is difficult to design a high-level Artificial Intelligence (AI) program for such a task due to its extremely large state-action space and real-time requirements. This paper formulates the task as a collective decentralized partially observable Markov decision process and designs a Deep Decentralized Policy Network (DDPN) to model the policies. To train DDPN effectively, a novel two-stage learning algorithm is proposed which combines imitation learning from the opponent with reinforcement learning by no-regret dynamics. Extensive experimental results on various combat scenarios indicate that the proposed method can defeat different opponent models and significantly outperforms many state-of-the-art approaches.
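The abstract only outlines the approach; as a rough illustration of the core idea of decentralized policies trained from a collective reward, the sketch below shows a policy network shared across units, executed on each unit's local observation, and updated with a single team return. This is not the paper's DDPN: the actual architecture, the imitation-learning stage, and the no-regret training dynamics are omitted, and all names, dimensions, and the REINFORCE-style update are illustrative assumptions.

```python
# Minimal sketch (assumed, not the paper's DDPN): one policy network shared by
# all units, applied to each unit's local observation (decentralized execution),
# and updated with a single collective team return shared by the whole group.
import torch
import torch.nn as nn

OBS_DIM, N_ACTIONS, N_UNITS = 32, 6, 5  # hypothetical sizes

class SharedUnitPolicy(nn.Module):
    """Maps one unit's local observation to a distribution over its actions."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(OBS_DIM, 64), nn.ReLU(),
            nn.Linear(64, N_ACTIONS),
        )

    def forward(self, obs):                      # obs: (n_units, OBS_DIM)
        return torch.distributions.Categorical(logits=self.net(obs))

policy = SharedUnitPolicy()
optim = torch.optim.Adam(policy.parameters(), lr=1e-3)

def reinforce_update(episode):
    """REINFORCE-style update on a recorded episode.
    episode: list of (obs_t, actions_t, team_return_t) tuples, where obs_t is
    (n_units, OBS_DIM), actions_t is (n_units,), and team_return_t is the
    discounted *collective* return from step t. Every unit's log-probability
    is weighted by the same team return, so credit is shared across the group."""
    loss = 0.0
    for obs_t, actions_t, ret_t in episode:
        dist = policy(obs_t)
        loss = loss - dist.log_prob(actions_t).sum() * ret_t
    optim.zero_grad()
    loss.backward()
    optim.step()

# Toy rollout with random tensors standing in for game observations and the
# team reward; in a real game loop the environment would supply both.
episode = []
for _ in range(10):
    obs = torch.randn(N_UNITS, OBS_DIM)
    with torch.no_grad():
        actions = policy(obs).sample()           # decentralized: one action per unit
    episode.append((obs, actions, float(torch.randn(()))))
reinforce_update(episode)
```

In this toy setup, parameter sharing keeps the number of weights independent of the unit count, while the common return is what makes the reward "collective"; the paper's two-stage training (imitation learning followed by no-regret reinforcement learning) would replace the plain policy-gradient update used here.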

Content Type: Conference Paper
Source URL: http://ir.ia.ac.cn/handle/173211/26156
Collection: Institute of Automation, Chinese Academy of Sciences
Corresponding Author: Junliang Xing
Author Affiliations:
1. Institute of Automation, Chinese Academy of Sciences
2. Horizon Robotics
Recommended Citation (GB/T 7714):
Peixi Peng, Junliang Xing, Lili Cao, et al. Learning Deep Decentralized Policy Network by Collective Rewards for Real-Time Combat Game[C]. In: . Macao, China, August 10-16, 2019.