Article ID: 1179853
Journal: Chemometrics and Intelligent Laboratory Systems
Published Year: 2013
Pages: 14 Pages
File Type: PDF
Abstract

• This paper proposes the use of fingerprint fragments to build a clustering tree.
• Several parameters determine the characteristics of the generated tree.
• The fingerprint pattern tree can be used to build matrix structures for developing 2D-QSAR models.

In this paper, we describe an algorithm for extracting the patterns present in fingerprints. The algorithm takes as input a data set of molecules, each represented by its corresponding fingerprint, and generates a set of disjoint patterns, themselves binary arrays, each satisfied by a subset of the input data set; these subsets are not necessarily disjoint. The algorithm has been developed in Java and shows acceptable performance, which allows its integration into free and proprietary computational chemistry software. The fingerprint patterns extracted from molecule data sets are organized in a hierarchical structure in which the nodes at each level can be used as clusters for classifying the data set. We analyze the usefulness of this tree structure and describe its advantages over MCS-based tree structures in the analysis of data sets. Moreover, fingerprint patterns are compared with other representational spaces for building QSAR models and show better results.
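The relationship between fingerprints, patterns, and clusters described in the abstract can be illustrated with a minimal Java sketch. This is not the authors' algorithm, which the abstract does not specify. It assumes that a fingerprint "satisfies" a pattern when every bit set in the pattern is also set in the fingerprint, and that a pattern for a group of molecules can be obtained as the bitwise AND of their fingerprints. The class name `FingerprintPatternSketch` and all method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

public class FingerprintPatternSketch {

    /** Builds a fingerprint from the indices of its set bits. */
    public static BitSet fingerprint(int... onBits) {
        BitSet fp = new BitSet();
        for (int bit : onBits) {
            fp.set(bit);
        }
        return fp;
    }

    /**
     * Assumption: a fingerprint satisfies a pattern when every bit
     * set in the pattern is also set in the fingerprint.
     */
    public static boolean satisfies(BitSet fp, BitSet pattern) {
        BitSet missing = (BitSet) pattern.clone();
        missing.andNot(fp);          // pattern bits absent from fp
        return missing.isEmpty();
    }

    /** Bits shared by all fingerprints in the group (bitwise AND). */
    public static BitSet commonPattern(List<BitSet> group) {
        if (group.isEmpty()) {
            return new BitSet();
        }
        BitSet common = (BitSet) group.get(0).clone();
        for (BitSet fp : group) {
            common.and(fp);
        }
        return common;
    }

    /** Indices of the molecules whose fingerprint satisfies the pattern. */
    public static List<Integer> cluster(List<BitSet> dataSet, BitSet pattern) {
        List<Integer> members = new ArrayList<>();
        for (int i = 0; i < dataSet.size(); i++) {
            if (satisfies(dataSet.get(i), pattern)) {
                members.add(i);
            }
        }
        return members;
    }

    public static void main(String[] args) {
        // Toy data set: three molecules with made-up fingerprints.
        List<BitSet> dataSet = List.of(
                fingerprint(0, 1, 4),
                fingerprint(0, 1, 5),
                fingerprint(2, 3, 5));

        BitSet patternA = commonPattern(dataSet.subList(0, 2)); // shared bits of molecules 0 and 1
        BitSet patternB = fingerprint(5);                       // a second pattern sharing no bits with patternA

        System.out.println("pattern A " + patternA + " -> molecules " + cluster(dataSet, patternA));
        System.out.println("pattern B " + patternB + " -> molecules " + cluster(dataSet, patternB));
    }
}
```

In this toy data set the two patterns share no bits, yet the molecule at index 1 satisfies both of them, which mirrors the abstract's point that disjoint patterns can correspond to overlapping subsets of the data set.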

Related Topics
Physical Sciences and Engineering > Chemistry > Analytical Chemistry