TencentBoost: A Gradient Boosting Tree System with Parameter Server
Jiang, Jie ; Jiang, Jiawei ; Cui, Bin ; Zhang, Ce
2017
Abstract (English): Gradient boosting tree (GBT), a widely used machine learning algorithm, achieves state-of-the-art performance in academia, industry, and data analytics competitions. Although existing scalable systems that implement GBT, such as XGBoost and MLlib, perform well on datasets with medium-dimensional features, they can suffer performance degradation in many industrial applications where the training datasets contain high-dimensional features. The degradation derives from their inefficient mechanisms for model aggregation: either map-reduce or all-reduce. To address this high-dimensional problem, we propose a scalable execution plan that uses the parameter server architecture to facilitate model aggregation. Further, we introduce a sparse-pull method and an efficient index structure to increase the processing speed. We implement a GBT system, namely TencentBoost, in the production cluster of Tencent Inc. The empirical results show that our system is 2-20x faster than existing platforms.
Funding: National Natural Science Foundation of China [61572039]; 973 program [2014CB340405]; Shenzhen Gov Research Project [JCYJ20151014093505032]; Tencent Research Grant (PKU)
Indexed in: CPCI-S (ISTP)
Pages: 281-284
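The abstract names parameter-server aggregation and a sparse-pull method but does not spell out their mechanics. The following is only a toy, single-process sketch of the general idea, not TencentBoost's implementation: workers push sparse partial gradient sums to a shared store, and a pull returns only the nonzero entries rather than a dense vector over the whole feature space. All names here (`ToyParameterServer`, `local_histogram`, `sparse_pull`) are hypothetical, and the per-feature gradient sum is a simplification of a real GBT histogram, which is typically bucketed per feature value and also tracks second-order statistics.

```python
from collections import defaultdict


class ToyParameterServer:
    """In-process stand-in for a parameter server (hypothetical, for illustration)."""

    def __init__(self):
        # Maps feature index -> accumulated gradient sum.
        self.hist = defaultdict(float)

    def push(self, partial):
        # Workers push sparse partial sums; the server adds them element-wise.
        for idx, val in partial.items():
            self.hist[idx] += val

    def sparse_pull(self):
        # Return only the nonzero entries instead of a full dense vector.
        return {i: v for i, v in self.hist.items() if v != 0.0}


def local_histogram(rows, grads):
    """Sum gradients per active feature for one worker's data shard.

    rows:  list of lists of active feature indices, one list per example
    grads: list of per-example gradients
    """
    partial = defaultdict(float)
    for feats, g in zip(rows, grads):
        for f in feats:
            partial[f] += g
    return dict(partial)


server = ToyParameterServer()
# Two workers, each holding sparse examples drawn from a very large feature space.
server.push(local_histogram([[3, 1_000_000], [3]], [0.5, -0.25]))
server.push(local_histogram([[1_000_000], [42]], [1.0, 2.0]))
merged = server.sparse_pull()
```

In this sketch the amount of data moved depends on the number of features actually touched (three here), not on the nominal dimensionality, which is the kind of saving a sparse pull is meant to provide on high-dimensional data.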
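Language: English
Source: IEEE 33rd International Conference on Data Engineering (ICDE)
DOI: 10.1109/ICDE.2017.87
Content type: Other
Source URL: http://ir.pku.edu.cn/handle/20.500.11897/469897
Collections: School of Electronics Engineering and Computer Science (信息科学技术学院)
School of Software and Microelectronics (软件与微电子学院)
Recommended citation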
GB/T 7714
Jiang, Jie,Jiang, Jiawei,Cui, Bin,et al. TencentBoost: A Gradient Boosting Tree System with Parameter Server. 2017-01-01.