Article code: 410361
Journal code: 679140
Publication year: 2010
English article: 14-page PDF
Full text: free download
English title of the ISI article
A modified Markov clustering approach to unsupervised classification of protein sequences
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract

In this paper we propose a modified Markov clustering algorithm for efficient and accurate clustering of large protein sequence databases, based on previously evaluated sequence similarity criteria. The proposed modification consists of an exponentially decreasing inflation rate: a strong inflation in the first iterations quickly establishes the hard structure of the clusters, and a weaker inflation thereafter produces fine partitions. The algorithm, which was tested and validated using either the whole SCOP95 database or randomly selected 10–50% sections of it, generally converges within 12–14 iteration cycles and provides clusters of high quality. Furthermore, a novel generalized formula for the inflation operation is given, and an efficient matrix symmetrization technique is presented, in order to improve the partition quality with a relatively low amount of extra computation. Finally, an extra speedup is achieved by excluding isolated proteins from further processing. The proposed method performs better than previous solutions in terms of both partition quality and computational load.
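
The idea of a decaying inflation rate can be illustrated with a minimal sketch of standard Markov clustering (expansion, then element-wise inflation, then column normalization). The abstract does not state the actual schedule, so the function `inflation_rate` and its parameters `r_start`, `r_end` and `decay` are hypothetical choices made only for illustration. The paper's generalized inflation formula, its symmetrization step and the exclusion of isolated proteins are not reproduced here.

```python
import numpy as np


def normalize_columns(m):
    # Rescale so that every column sums to 1 (column-stochastic matrix).
    return m / m.sum(axis=0, keepdims=True)


def inflation_rate(t, r_start=3.0, r_end=1.5, decay=0.5):
    # Hypothetical schedule: strong inflation at t = 0 that decays
    # exponentially toward a weaker rate r_end in later iterations.
    return r_end + (r_start - r_end) * np.exp(-decay * t)


def mcl_decaying_inflation(similarity, max_iter=50, tol=1e-6, threshold=1e-3):
    # similarity: square, non-negative matrix with a non-zero diagonal.
    m = normalize_columns(np.asarray(similarity, dtype=float))
    for t in range(max_iter):
        previous = m
        m = m @ m                    # expansion: two-step random walk
        m = m ** inflation_rate(t)   # inflation: element-wise power
        m = normalize_columns(m)
        if np.abs(m - previous).max() < tol:
            break
    # Label each column by the first row that still holds noticeable mass;
    # columns sharing the same attractor rows receive the same label.
    return (m > threshold).argmax(axis=0)


# Toy input (not protein data): two groups of three items each,
# with strong similarity inside a group and weak similarity across groups.
similarity = np.full((6, 6), 0.01)
similarity[:3, :3] = 1.0
similarity[3:, 3:] = 1.0
labels = mcl_decaying_inflation(similarity)
```

On this toy matrix the first three items end up with one label and the last three with another. A real protein similarity matrix would be large and sparse, so a practical version would need sparse storage and pruning of small entries, which this sketch omits.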

Publisher
Database: Elsevier - ScienceDirect
Journal: Neurocomputing - Volume 73, Issues 13–15, August 2010, Pages 2332–2345