Multi-Scaling Sampling: An Adaptive Sampling Method for Discovering Approximate Association Rules
Xie-Ping Gao; Cai-Yan Jia
Journal: Journal of Computer Science and Technology
Year: 2005
Volume: 20, Issue: 3, Pages: 309-318
Keywords: Data Mining; Association Rule; Frequent Itemset; Sample Error; Multi-scaling Sampling; Large-vocabulary Continuous Chinese Speech Recognition; Word Decoding; Syllable-Synchronous Network Search; Word Segmentation
Abstract: One obstacle to efficient association rule mining is the explosive growth of data sets, since it is costly or impossible to scan large databases, especially multiple times. A popular way to improve the speed and scalability of association rule mining is to run the algorithm on a random sample instead of the entire database. However, how to effectively define and efficiently estimate the degree of error in the outcome of the algorithm, and how to determine the sample size needed, remain open research questions. In this paper, an effective and efficient algorithm based on PAC (Probably Approximately Correct) learning theory is given to measure and estimate sample error. Then a new adaptive, on-line, fast sampling strategy called multi-scaling sampling, inspired by MRA (Multi-Resolution Analysis) and the Shannon sampling theorem, is presented for quickly obtaining acceptably approximate association rules at an appropriate sample size. Both theoretical analysis and empirical study show that the sampling strategy achieves a very good speed-accuracy trade-off.
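To make the sample-size question in the abstract concrete, the sketch below uses the standard Hoeffding bound for estimating the support of a single, fixed itemset from a uniform random sample. This is a textbook bound, not the PAC-based estimator or the multi-scaling strategy of the paper, whose details this record does not give. The function names `hoeffding_sample_size` and `estimate_support`, and the parameters `epsilon` (absolute support error) and `delta` (failure probability), are assumptions chosen for illustration.

```python
import math
from typing import Iterable, Set


def hoeffding_sample_size(epsilon: float, delta: float) -> int:
    """Smallest n such that 2 * exp(-2 * n * epsilon**2) <= delta.

    With n transactions drawn independently and uniformly at random, the
    sampled support of one fixed itemset then differs from its true support
    by more than epsilon with probability at most delta (Hoeffding bound).
    """
    if not (0.0 < epsilon < 1.0) or not (0.0 < delta < 1.0):
        raise ValueError("epsilon and delta must lie strictly between 0 and 1")
    return math.ceil(math.log(2.0 / delta) / (2.0 * epsilon ** 2))


def estimate_support(sample: Iterable[Set[str]], itemset: Set[str]) -> float:
    """Fraction of sampled transactions that contain every item of itemset."""
    transactions = list(sample)
    if not transactions:
        raise ValueError("sample must contain at least one transaction")
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


if __name__ == "__main__":
    # Illustrative values only: 1% absolute error, 95% confidence.
    print(hoeffding_sample_size(epsilon=0.01, delta=0.05))
```

Note that the bound does not depend on the database size, which is why sampling scales well, but it is a worst-case figure for one itemset; an adaptive strategy such as the one the abstract describes aims to stop at a smaller sample when the data allow it.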
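Language: English
Date available: 2010-11-03
Content type: Journal article
Source URL: http://ictir.ict.ac.cn/handle/311040/825
Collection: Institute of Computing Technology, Chinese Academy of Sciences, Journal Articles, 2005 (English)
Recommended citation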
GB/T 7714
Xie-Ping Gao, Cai-Yan Jia. Multi-Scaling Sampling: An Adaptive Sampling Method for Discovering Approximate Association Rules[J]. Journal of Computer Science and Technology, 2005, 20(3): 309-318.
APA Xie-Ping Gao, & Cai-Yan Jia. (2005). Multi-Scaling Sampling: An Adaptive Sampling Method for Discovering Approximate Association Rules. Journal of Computer Science and Technology, 20(3), 309-318.
MLA Xie-Ping Gao, et al. "Multi-Scaling Sampling: An Adaptive Sampling Method for Discovering Approximate Association Rules". Journal of Computer Science and Technology 20.3 (2005): 309-318.