Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
2076397 | 1079442 | 2010 | 12-page PDF | Free download |
We compare several spectral-domain-based clustering methods for partitioning protein sequence data. The main instrument for this exercise is the spectral density ratio model, which specifies that the logarithmic ratio of two or more unknown spectral density functions takes the form of a parametric linear combination of cosines. Maximum likelihood inference is worked out in detail, and its output is shown to yield several distance measures among independent stationary time series. These measures are suitable for clustering time series data based on their second-order properties. Other spectral-domain-based distances are investigated as well, and all methods and distances are compared on the problem of producing segmentations of bacterial outer membrane proteins consistent with their transmembrane topology. Protein sequences are transformed to time series data by employing numerical scales of physicochemical parameters. We also present results on the prediction of transmembrane β-strands, based on the clustering outcome, for a representative set of bacterial outer membrane proteins with known three-dimensional structure.
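The pipeline described above (sequence to numeric series, then a spectral-domain distance) can be illustrated with a minimal sketch. This is not the maximum likelihood spectral density ratio method of the paper: it swaps in a much simpler distance, the mean squared difference of log raw periodograms. The choice of the Kyte-Doolittle hydropathy scale and the function names `to_series`, `periodogram`, `log_spectral_distance`, and `pairwise_distances` are assumptions made for illustration; the abstract does not name a specific scale.

```python
import numpy as np

# Kyte-Doolittle hydropathy values (assumed scale; the abstract names none).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def to_series(seq):
    """Map a protein sequence to a mean-centred numeric series."""
    x = np.array([KD[aa] for aa in seq], dtype=float)
    return x - x.mean()


def periodogram(x):
    """Raw periodogram at the positive Fourier frequencies."""
    n = len(x)
    dft = np.fft.rfft(x)
    # Drop frequency zero, which carries no information after centring.
    return np.abs(dft[1:]) ** 2 / (2.0 * np.pi * n)


def log_spectral_distance(x, y, eps=1e-12):
    """Mean squared difference of log periodograms (equal-length series)."""
    px, py = periodogram(x), periodogram(y)
    # eps guards against log(0) at frequencies with no power.
    return float(np.mean((np.log(px + eps) - np.log(py + eps)) ** 2))


def pairwise_distances(series):
    """Symmetric distance matrix, usable as input to a clustering routine."""
    k = len(series)
    d = np.zeros((k, k))
    for i in range(k):
        for j in range(i + 1, k):
            d[i, j] = d[j, i] = log_spectral_distance(series[i], series[j])
    return d
```

In a segmentation setting, the series passed to `pairwise_distances` would typically be equal-length windows cut from one protein, so that windows with similar second-order behaviour end up in the same cluster. The raw periodogram is a noisy estimate, so a smoothed estimate may be preferable in practice.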
Journal: Biosystems - Volume 100, Issue 2, May 2010, Pages 132–143