A Locality-Based Performance Model for Load-and-Compute Style Computation
Yuan Liang ; Zhang Yunquan
2012
Conference name: IEEE International Conference on Cluster Computing
Conference dates: Sep 24-28, 2012
Conference location: Beijing, People's Republic of China
Keywords: locality function; cache partition; private cache; shared cache
Pages: 566-571
Abstract: The growing speed gap between processor and memory is often the critical bottleneck in achieving high performance. Hardware caches, programming models, algorithms, and data structures have been proposed to exploit locality and reduce memory overhead. Some of these designs share a common load-and-compute style, in which the algorithm first moves all needed data into cache and then operates only on the data that is ready. In this paper, we introduce a locality function to model the reuse ability of an algorithm and propose a corresponding performance model. We then theoretically analyze how to use and design caches under our model: (1) We present theorems that give the optimal cache partition scheme for the software buffering technique, which aims to hide memory overhead. (2) We provide methods to determine the optimal multicore design that maximizes the benefits of both shared and private caches. (3) We incorporate memory overhead into Amdahl's Law to study the speedup limit imposed by memory bandwidth.
Indexed by: ISTP; EI
Conference sponsors: IEEE, IEEE Comp Soc, IEEE Tech Comm Scalable Comp (TCSC), Sugon, Intel, Inspur, VMware, Mellanox, PARATERA, BLSC, LoongStore, Nvidia
Proceedings: Proceedings - 2012 IEEE International Conference on Cluster Computing, CLUSTER 2012
Subject: Computer Science
Language: English
ISSN: 1552-5244
Content type: Conference paper
Source URL: http://ir.iscas.ac.cn/handle/311060/15803
Collection: Institute of Software_Institute of Software Library_Conference Papers
Recommended citation
GB/T 7714
Yuan Liang, Zhang Yunquan. A locality-based performance model for load-and-compute style computation[C]. In: IEEE International Conference on Cluster Computing. Beijing, People's Republic of China. Sep 24-28, 2012.