Multiple alignment-free sequence comparison | |
Ren, Jie ; Song, Kai ; Sun, Fengzhu ; Deng, Minghua ; Reinert, Gesine | |
2013 | |
Keywords | REGULATORY SEQUENCES WORD MATCHES ENHANCERS IDENTIFICATION SIMILARITY |
Abstract (English) | Motivation: Recently, a range of new statistics have become available for the alignment-free comparison of two sequences based on k-tuple word content. Here, we extend these statistics to the simultaneous comparison of more than two sequences. Our suite of statistics contains, first, $C_\ell^*$ and $C_\ell^S$, extensions of statistics for pairwise comparison of the joint k-tuple content of all the sequences, and second, $\overline{C_2^*}$, $\overline{C_2^S}$ and $\overline{C_2^{geo}}$, averages of sums of pairwise comparison statistics. The two tasks we consider are, first, to identify sequences that are similar to a set of target sequences, and, second, to measure the similarity within a set of sequences. Results: Our investigation uses both simulated data and cis-regulatory module data, where the task is to identify cis-regulatory modules with similar transcription factor binding sites. We find that although all of our statistics show a similar performance on real data, on simulated data the Shepp-type statistics are in some instances outperformed by star-type statistics. The multiple alignment-free statistics are more sensitive to contamination in the data than the pairwise average statistics.; Biochemical Research Methods; Biotechnology & Applied Microbiology; Computer Science, Interdisciplinary Applications; Mathematical & Computational Biology; Statistics & Probability; SCI(E); PubMed; 1; ARTICLE; 21; 2690-2698; 29 |
Language | English |
Source | PubMed; SCI |
Publisher | Bioinformatics |
Content type | Other |
Source URL | [http://hdl.handle.net/20.500.11897/342628] |
Collection | School of Mathematical Sciences |
Recommended citation (GB/T 7714) | Ren, Jie, Song, Kai, Sun, Fengzhu, et al. Multiple alignment-free sequence comparison. 2013-01-01. |
Unless otherwise stated, all content in this system is protected by copyright, and all rights are reserved.