AlphaHoldem: High-Performance Artificial Intelligence for Heads-Up No-Limit Poker via End-to-End Reinforcement Learning
Zhao EM(赵恩民)2,3; Yan RY(闫仁业)2,3; Li JQ(李金秋)2,3; Li K(李凯)3; Xing JL(兴军亮)1,2,3
2021-02
Conference date: 2022-02-22
Conference location: Online
English abstract
Heads-up no-limit Texas hold’em (HUNL) is the quintessential game with imperfect information. Representative prior works like DeepStack and Libratus heavily rely on counterfactual regret minimization (CFR) and its variants to tackle HUNL. However, the prohibitive computation cost of CFR iteration makes it difficult for subsequent researchers to learn the CFR model in HUNL and apply it in other practical applications. In this work, we present AlphaHoldem, a high-performance and lightweight HUNL AI obtained with an end-to-end self-play reinforcement learning framework. The proposed framework adopts a pseudo-siamese architecture to directly learn from the input state information to the output actions by competing the learned model with its different historical versions. The main technical contributions include a novel state representation of card and betting information, a multi-task self-play training loss function, and a new model evaluation and selection metric to generate the final model. In a study involving 100,000 hands of poker, AlphaHoldem defeats Slumbot and DeepStack using only one PC with three days training. At the same time, AlphaHoldem only takes 2.9 milliseconds for each decision-making using only a single GPU, more than 1,000 times faster than DeepStack.
Language: English
Content type: Conference paper
Source URL: http://ir.ia.ac.cn/handle/173211/52251
Collection: Fusion Innovation Center_Decision Command and System Intelligence
Author affiliations:
1. Tsinghua University
2. School of Artificial Intelligence, University of Chinese Academy of Sciences
3. Institute of Automation, Chinese Academy of Sciences
Recommended citation
GB/T 7714
Zhao EM, Yan RY, Li JQ, et al. AlphaHoldem: High-Performance Artificial Intelligence for Heads-Up No-Limit Poker via End-to-End Reinforcement Learning[C]. In: . Online. 2022-02-22.