Multi-Scaling Sampling: An Adaptive Sampling Method for Discovering Approximate Association Rules | |
Xie-Ping Gao; Cai-Yan Jia | |
Journal | Journal of Computer Science and Technology |
2005 | |
Volume | 20; Issue: 3; Pages: 309-318 |
Keywords | Data Mining; Association Rule; Frequent Itemset; Sample Error; Multi-scaling Sampling; Large-vocabulary Continuous Chinese Speech Recognition; Word Decoding; Syllable-Synchronous Network Search; Word Segmentation |
Abstract | One obstacle to efficient association rule mining is the explosive growth of data sets, since scanning a large database, especially multiple times, is costly or impossible. A popular way to improve the speed and scalability of association rule mining is to run the algorithm on a random sample instead of the entire database. However, how to effectively define and efficiently estimate the degree of error in the outcome of the algorithm, and how to determine the sample size needed, remain open research problems. In this paper, an effective and efficient algorithm based on PAC (Probably Approximately Correct) learning theory is given to measure and estimate sample error. Then a new adaptive, on-line, fast sampling strategy, multi-scaling sampling, is presented. It is inspired by MRA (Multi-Resolution Analysis) and the Shannon sampling theorem, and is intended to quickly obtain acceptably approximate association rules at an appropriate sample size. Both theoretical analysis and empirical study have shown that the sampling strategy can achieve a very good speed-accuracy trade-off. |
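Language | English |
Date made public | 2010-11-03 |
Content type | Journal article |
Source URL | [http://ictir.ict.ac.cn/handle/311040/825] |
Collection | Institute of Computing Technology, Chinese Academy of Sciences: Journal Articles_2005 (English) |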
Recommended citation, GB/T 7714 | Xie-Ping Gao, Cai-Yan Jia. Multi-Scaling Sampling: An Adaptive Sampling Method for Discovering Approximate Association Rules[J]. Journal of Computer Science and Technology, 2005, 20(3): 309-318. |
APA | Xie-Ping Gao, & Cai-Yan Jia. (2005). Multi-Scaling Sampling: An Adaptive Sampling Method for Discovering Approximate Association Rules. Journal of Computer Science and Technology, 20(3), 309-318. |
MLA | Xie-Ping Gao, et al. "Multi-Scaling Sampling: An Adaptive Sampling Method for Discovering Approximate Association Rules". Journal of Computer Science and Technology 20.3 (2005): 309-318. |
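
The abstract describes bounding sample error and choosing a sample size adaptively but gives no formulas. As a rough illustration only, the sketch below uses the standard Hoeffding inequality (not the paper's PAC-based estimator) to bound the sample size for a single itemset's support, and a generic doubling schedule (not the paper's multi-scaling schedule) to grow the sample. The function names, the `start` parameter, and the stopping heuristic are all assumptions made for this example.

```python
import math
import random


def hoeffding_sample_size(epsilon, delta):
    """Number of sampled transactions so that the sampled support of ONE
    itemset is within epsilon of its true support with probability at
    least 1 - delta (two-sided Hoeffding bound)."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * epsilon ** 2))


def sampled_support(transactions, itemset, n, rng):
    """Estimate support of `itemset` from n transactions drawn with replacement."""
    hits = sum(1 for _ in range(n) if itemset <= rng.choice(transactions))
    return hits / n


def progressive_support(transactions, itemset, epsilon, delta, start=100, seed=0):
    """Hypothetical adaptive loop: double the sample size until two successive
    estimates differ by less than epsilon (a heuristic, not a guarantee),
    never exceeding the Hoeffding sample size."""
    rng = random.Random(seed)
    cap = hoeffding_sample_size(epsilon, delta)
    n = min(start, cap)
    previous = None
    while True:
        estimate = sampled_support(transactions, itemset, n, rng)
        if n >= cap or (previous is not None and abs(estimate - previous) < epsilon):
            return estimate, n
        previous = estimate
        n = min(2 * n, cap)
```

For example, `hoeffding_sample_size(0.05, 0.05)` returns 738, independent of the database size. Transactions and itemsets are assumed to be `frozenset` objects so that `itemset <= transaction` tests containment.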