The multiple sequence sets: problem and heuristic algorithms
Kang Ning; Hon Wai Leong
2011
Journal: Journal of Combinatorial Optimization
Issue: 4
Abstract
A “sequence set” is a mathematical model used in many applications, such as biological sequence analysis and text processing. However, a single sequence set is not an appropriate model for rapidly increasing problem sizes. For example, very large genome sequences should be split and processed chunk by chunk. For these applications, the underlying mathematical model is “Multiple Sequence Sets” (MSS). To process multiple sequence sets, sequences are distributed to different sets, and the sequences in each set are then processed in parallel. Deriving effective algorithms for MSS processing is challenging.
In this paper, we first define the cost functions for the problem of Process of Multiple Sequence Sets (PMSS). The PMSS problem is then formulated as minimizing the total cost of processing. Based on an analysis of the features of multiple sequence sets, we propose the Distribution and Deposition (DDA) algorithm and the DDA* algorithm for the PMSS problem. In the DDA algorithm, the sequences are first distributed to multiple sets according to their alphabet contents; the sequences in each set are then processed by a deposition algorithm. The DDA* algorithm differs from DDA in that it distributes sequences by clustering based on a set of sequence features. Experiments showed that DDA and DDA* always produced smaller results than the other algorithms, and that DDA* outperformed DDA in most instances. Both algorithms were also efficient in time and space.
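The abstract gives only an outline of DDA, so the following is a minimal Python sketch of the distribution step alone, under stated assumptions. It assumes that "alphabet contents" means the set of distinct symbols in a sequence, which the abstract does not confirm. The names `distribute_by_alphabet`, `process_sets`, and `total_length` are hypothetical, and `total_length` is a placeholder cost standing in for the paper's deposition algorithm and cost functions, neither of which is specified here.

```python
from collections import defaultdict
from typing import Callable, Dict, FrozenSet, List


def distribute_by_alphabet(sequences: List[str]) -> Dict[FrozenSet[str], List[str]]:
    """Group sequences that use exactly the same set of symbols.

    Assumption: "alphabet contents" is read as the set of distinct
    characters in a sequence; the paper may use a different criterion.
    """
    groups: Dict[FrozenSet[str], List[str]] = defaultdict(list)
    for seq in sequences:
        groups[frozenset(seq)].append(seq)
    return dict(groups)


def process_sets(
    groups: Dict[FrozenSet[str], List[str]],
    process_one: Callable[[List[str]], int],
) -> int:
    """Apply a per-set routine to every set and sum the returned costs.

    The sets are independent, so they could be handled in parallel;
    this sketch processes them one after another for simplicity.
    """
    return sum(process_one(members) for members in groups.values())


def total_length(members: List[str]) -> int:
    """Placeholder cost (hypothetical): total number of symbols in a set."""
    return sum(len(s) for s in members)


seqs = ["ACGT", "TGCA", "AACC", "CACA", "GGTT"]
groups = distribute_by_alphabet(seqs)
cost = process_sets(groups, total_length)
```

With this input, the sequences fall into three sets, keyed by the alphabets {A, C, G, T}, {A, C}, and {G, T}. A real per-set routine would replace `total_length` once the deposition algorithm and cost functions from the paper are known.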
Document type: Journal article
Item identifier: http://ir.qibebt.ac.cn/handle/337004/1083
Collection: Single-Cell Center Group
Recommended citation
GB/T 7714
Kang Ning, Hon Wai Leong. The multiple sequence sets: problem and heuristic algorithms[J]. Journal of Combinatorial Optimization, 2011(4).
APA Kang Ning, & Hon Wai Leong. (2011). The multiple sequence sets: problem and heuristic algorithms. Journal of Combinatorial Optimization(4).
MLA Kang Ning, et al. "The multiple sequence sets: problem and heuristic algorithms". Journal of Combinatorial Optimization. 4(2011).