Multiple alignment-free sequence comparison
Ren, Jie ; Song, Kai ; Sun, Fengzhu ; Deng, Minghua ; Reinert, Gesine
2013
Keywords: REGULATORY SEQUENCES; WORD MATCHES; ENHANCERS; IDENTIFICATION; SIMILARITY
Abstract:
Motivation: Recently, a range of new statistics have become available for the alignment-free comparison of two sequences based on k-tuple word content. Here, we extend these statistics to the simultaneous comparison of more than two sequences. Our suite of statistics contains, first, $C_\ell^*$ and $C_\ell^S$, extensions of statistics for pairwise comparison of the joint k-tuple content of all the sequences, and second, $\overline{C}_2^*$, $\overline{C}_2^S$ and $\overline{C}_2^{geo}$, averages of sums of pairwise comparison statistics. The two tasks we consider are, first, to identify sequences that are similar to a set of target sequences, and, second, to measure the similarity within a set of sequences.
Results: Our investigation uses both simulated data and cis-regulatory module data, where the task is to identify cis-regulatory modules with similar transcription factor binding sites. We find that although all of our statistics show similar performance on real data, on simulated data the Shepp-type statistics are in some instances outperformed by star-type statistics. The multiple alignment-free statistics are more sensitive to contamination in the data than the pairwise average statistics.
Subject categories: Biochemical Research Methods; Biotechnology & Applied Microbiology; Computer Science, Interdisciplinary Applications; Mathematical & Computational Biology; Statistics & Probability
Indexing and citation fields: SCI(E); PubMed; 1; ARTICLE; 21; 2690-2698; 29
Language: English
Source: PubMed; SCI
Publisher: Bioinformatics
Content type: Other
Source URL: http://hdl.handle.net/20.500.11897/342628
Collection: School of Mathematical Sciences
Recommended citation
GB/T 7714
Ren, Jie,Song, Kai,Sun, Fengzhu,et al. Multiple alignment-free sequence comparison. 2013-01-01.