Hunting Killer Tasks for Cloud System Through Machine Learning: A Google Cluster Case Study
Tang Hongyan ; Li Ying ; Jia Tong ; Wu Zhonghai
2016
Keywords: killer tasks; online recognition; time series; behavior pattern; cloud computing system
Abstract: Motivated by frequent failures in cloud computing systems, we analyze the failure frequency and failure continuity of tasks in the Google cloud cluster and identify what we call killer tasks: tasks that suffer from frequent failures and repeated rescheduling. Killer tasks waste resources unnecessarily and significantly increase scheduling workload, which can be a major concern in cloud systems. We aim to recognize killer tasks at the earliest stage of their occurrence so that they can be addressed proactively instead of being rescheduled repeatedly, thereby improving reliability and saving resources. Recognizing killer tasks among a large number of tasks in real time is challenging. In this paper, we first investigate the characteristics and behavior patterns of killer tasks and then develop two machine-learning-based methods, K-HUNTER and C-HUNTER, for online recognition of killer tasks. The empirical results show that our approach achieves 97% precision in recognizing killer tasks, with an 89% timing advance and 88% resource saving for the cloud system on average.
Indexed in: CPCI-S(ISTP)
Pages: 1-12
Language: English
Source: IEEE International Conference on Software Quality, Reliability and Security (QRS)
DOI: 10.1109/QRS.2016.11
Content type: Other
Source URL: [http://ir.pku.edu.cn/handle/20.500.11897/460154]
Collection: School of Software and Microelectronics
Recommended citation
GB/T 7714
Tang Hongyan,Li Ying,Jia Tong,et al. Hunting Killer Tasks for Cloud System Through Machine Learning: A Google Cluster Case Study. 2016-01-01.