Article ID: 5132316
Journal: Chemometrics and Intelligent Laboratory Systems
Published Year: 2017
Pages: 5
File Type: PDF
Abstract

• We discussed different alignment-free methods to provide fast, accurate, and scalable solutions to sequence comparison.
• To support open-science, facilitate collaboration, and promote research, the platform is implemented as a toolkit using R.
• The library msktuple includes locational k-tuple, naive k-tuple, CV-Tree, and their ensembles. It is continuously updated.
• We provided GitHub open-source code: https://github.com/saeidamiri1/msktuple/wiki.

Recently, alignment-free sequence comparison methods based on promoter-frequency distance measures have gained popularity. This paper reports on the implementation and validation of several alignment-free sequence analysis methods for representing and quantifying between-sequence distances and sequence variability. The msktuple library includes the following sequence comparison techniques: locational k-tuple, naive k-tuple, CV-Tree, and their ensemble variants. Each of these metrics measures the dissimilarity between sequences in terms of their k-letter words (k-tuples). In support of open science, we provide open-source software, R scripts, and protocols implementing the new techniques. These tools will support collaboration, enable independent validation, and promote reproducibility of results and interoperability between tools.
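
To make the idea of a k-letter-word dissimilarity concrete, the following is a minimal sketch of a naive k-tuple comparison. It is an illustration only and is not the msktuple API (which is written in R). The function names `ktuple_frequencies` and `ktuple_distance`, the DNA alphabet `ACGT`, and the choice of Euclidean distance between relative word frequencies are all assumptions made for this example. The locational, CV-Tree, and ensemble variants are not shown.

```python
from collections import Counter
from itertools import product
import math


def ktuple_frequencies(seq, k, alphabet="ACGT"):
    """Relative frequency of every k-letter word over `alphabet` in `seq`.

    Hypothetical helper for illustration; words containing letters
    outside `alphabet` are ignored.
    """
    seq = seq.upper()
    # Count all overlapping windows of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Fixed word order so that vectors from different sequences line up.
    words = ["".join(p) for p in product(alphabet, repeat=k)]
    total = sum(counts[w] for w in words)
    if total == 0:
        raise ValueError("sequence contains no valid k-letter words")
    return [counts[w] / total for w in words]


def ktuple_distance(seq_a, seq_b, k=2):
    """Euclidean distance between k-tuple frequency vectors (assumed metric)."""
    fa = ktuple_frequencies(seq_a, k)
    fb = ktuple_frequencies(seq_b, k)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(fa, fb)))
```

Because each sequence is reduced to a fixed-length frequency vector, no alignment is needed and sequences of different lengths can be compared directly, for example with `ktuple_distance("ACGTACGTAC", "ACGGTTACGA", k=2)`.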
