Article ID: 7561898
Journal: Chemometrics and Intelligent Laboratory Systems
Published Year: 2018
Pages: 24 Pages
File Type: PDF
Abstract
As the number of natural and synthesized compounds increases rapidly, managing compounds in large databases is becoming a challenging task. This is especially true for a compound that can be represented by several different graphs; benzene, for example, can be drawn with alternating single and double bonds or with aromatic bonds. Chemists regard these different graphs as the same compound, and they are called equivalent chemical graphs or equivalent chemical structures herein (structures that differ in the positions of H atoms, such as enol and keto forms, are not considered equivalent chemical structures). Researchers generally want each group of equivalent chemical structures (represented by different graphs) to be denoted by the same value, so that the group is easy to handle. To this end, a highly discriminating index, ATID (adjacent topology identification), derived from the graph-theoretical index 3-EAID, is proposed. To reduce the chance that different compounds are mistaken for duplicates, the uniqueness of ATID was tested on over 60 million alkanes and over 19 million benzenoids, both sets of highly similar structures, and the results indicate that ATID possesses high discriminating ability. Finally, ATID was successfully applied to the retrieval of duplicate compounds in large databases. The results indicate that ATID could be a valuable tool for chemical information management. ATID gives results comparable to those of InChI (International Chemical Identifier), which depends on atom numbering and is derived from complicated rules.
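The duplicate-retrieval idea in the abstract can be illustrated with a minimal Python sketch: compute one key per structure, then group database records that share a key. The key function below, `toy_key`, is a hypothetical stand-in and is not ATID or 3-EAID; it ignores bond orders and records only each atom's element, hydrogen count, and neighbours, so it discriminates far less than the index described above. The record layout and all names here are assumptions made for illustration.

```python
from collections import defaultdict


def toy_key(atoms, bonds):
    """Crude stand-in for a structure identifier (NOT ATID).

    atoms: list of (element, hydrogen_count) tuples, indexed by position.
    bonds: list of (i, j, order) tuples; the bond order is deliberately
    ignored so that Kekule and aromatic drawings yield the same key.
    """
    neighbors = defaultdict(list)
    for i, j, _order in bonds:
        neighbors[i].append(atoms[j])
        neighbors[j].append(atoms[i])
    # Sort so the key does not depend on how the atoms were numbered.
    return tuple(
        sorted((atoms[i], tuple(sorted(neighbors[i]))) for i in range(len(atoms)))
    )


def find_duplicates(records, key=toy_key):
    """Group record names by key and return the groups with more than one member."""
    groups = defaultdict(list)
    for name, atoms, bonds in records:
        groups[key(atoms, bonds)].append(name)
    return [names for names in groups.values() if len(names) > 1]


# Six-membered ring: atom i is bonded to atom (i + 1) mod 6.
ring = [(i, (i + 1) % 6) for i in range(6)]
kekule = [(a, b, 2 if n % 2 == 0 else 1) for n, (a, b) in enumerate(ring)]
aromatic = [(a, b, 1.5) for a, b in ring]
single = [(a, b, 1) for a, b in ring]

records = [
    ("benzene_kekule", [("C", 1)] * 6, kekule),
    ("benzene_aromatic", [("C", 1)] * 6, aromatic),
    ("cyclohexane", [("C", 2)] * 6, single),
]

duplicates = find_duplicates(records)
print(duplicates)
```

In this sketch the two benzene drawings fall into one group, while cyclohexane stays separate because its carbons carry two hydrogens each. Replacing `toy_key` with a genuinely discriminating identifier is what makes the grouping step reliable on a large database.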
Related Topics
Physical Sciences and Engineering > Chemistry > Analytical Chemistry