Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
15176 | Computational Biology and Chemistry | 2012 | 10 Pages |
In this paper, we propose a method to classify prokaryotic genomes using the agglomerative information bottleneck method for unsupervised clustering. The method is closely related to a group of methods based on detecting the presence or absence of genes, but it differs in that it also uses gene lengths. We show that this amended method is reliable. To evaluate robustness, we apply bootstrap and jackknife techniques to the input data, which allows us to propose an approach for determining the stability level of a cladogram. We demonstrate that the genome tree produced for a selected small group of genomes closely resembles the phylogenetic tree of this group.
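The abstract does not spell out the clustering step, so the following is only a minimal sketch of the standard agglomerative information bottleneck merge rule, not the authors' implementation. It assumes each genome has already been encoded as a prior weight plus a normalized distribution over features; how gene presence and gene length enter that encoding is not stated here, and the toy numbers and the names `merge_cost` and `agglomerative_ib` are hypothetical. At each step the two clusters whose merge loses the least information (the weighted Jensen-Shannon divergence between their distributions) are joined.

```python
import math


def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; zero-probability terms of p are skipped."""
    return sum(a * math.log2(a / b) for a, b in zip(p, q) if a > 0)


def merge_cost(w_i, p_i, w_j, p_j):
    """Information lost by merging two clusters: (w_i + w_j) * weighted JS divergence."""
    w = w_i + w_j
    pi_i, pi_j = w_i / w, w_j / w
    mix = [pi_i * a + pi_j * b for a, b in zip(p_i, p_j)]
    return w * (pi_i * kl(p_i, mix) + pi_j * kl(p_j, mix))


def agglomerative_ib(weights, dists):
    """Greedily merge clusters; returns a list of (members_a, members_b, cost) in merge order."""
    # Each cluster is stored as (weight, distribution, member indices).
    clusters = {k: (weights[k], list(dists[k]), (k,)) for k in range(len(weights))}
    merges = []
    while len(clusters) > 1:
        keys = sorted(clusters)
        best = None
        # Exhaustive search for the cheapest pair to merge.
        for pos, a in enumerate(keys):
            for b in keys[pos + 1:]:
                cost = merge_cost(clusters[a][0], clusters[a][1],
                                  clusters[b][0], clusters[b][1])
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        cost, a, b = best
        w_a, p_a, m_a = clusters.pop(a)
        w_b, p_b, m_b = clusters.pop(b)
        w = w_a + w_b
        # The merged cluster carries the weight-averaged distribution.
        merged = [(w_a * x + w_b * y) / w for x, y in zip(p_a, p_b)]
        clusters[a] = (w, merged, m_a + m_b)
        merges.append((m_a, m_b, cost))
    return merges


if __name__ == "__main__":
    # Hypothetical toy input: four "genomes" with equal prior weight.
    toy_weights = [0.25, 0.25, 0.25, 0.25]
    toy_dists = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1],
                 [0.1, 0.2, 0.7], [0.1, 0.3, 0.6]]
    for left, right, cost in agglomerative_ib(toy_weights, toy_dists):
        print(left, right, round(cost, 4))
```

The returned merge sequence defines a dendrogram. Under the same assumptions, a jackknife or bootstrap check would rerun `agglomerative_ib` on resampled feature columns and count how often each grouping reappears.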
Highlights
► We present a method of genome clustering using the information bottleneck approach.
► The resulting dendrograms are strongly similar to the corresponding phylogenetic trees.
► We estimate the robustness of the proposed method.
► We show that jackknifing can be used to determine the stability of a genome classification.
► We conclude that using a preprocessing filtering parameter is valuable.