Article ID: 534283
Journal: Pattern Recognition Letters
Published Year: 2014
Pages: 6
File Type: PDF
Abstract

• NLP-inspired structural pattern recognition was applied to predict chemical activity.
• It can be combined with statistical pattern recognition through early or late fusion.
• The new method allows searching for “structural alerts” and their combinatorial patterns.
• A biodegradability prediction system serves to illustrate the new method.
• The source code, data sets and the list of detected structural alerts are available from the corresponding author on request.

In this paper we report on a new structural pattern recognition approach for in silico prediction of chemical activity. It is based on grammatical inference over strings representing chemical compounds and on the string edit distance between a chemical compound and a formal grammar that generalizes an activity class. In the late 1980s Weininger published a chemical language with a very simple and natural grammar, and algorithms suitable for processing this language have been developed recently. Modeling activity classes as formal grammars and chemical compounds as words makes it possible to search for “structural alerts”, that is, molecular substructures and their combinatorial patterns that cause a molecule to have properties of interest. A biodegradability prediction system has been constructed to serve as an example throughout the paper. The source code and various files from the experiment are available from the corresponding author on request.
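
To make the idea of classifying a compound by string edit distance concrete, the following is a minimal sketch and not the authors' implementation. It assumes compounds are written as SMILES-like strings (assumed here to be the chemical language the abstract attributes to Weininger), and it replaces the distance to an inferred grammar with a much cruder stand-in: the smallest Levenshtein distance to any example string of a class. The function names, the class labels `class_a` and `class_b`, and the example strings are hypothetical and carry no real activity information.

```python
from typing import Dict, List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance with unit costs for insert, delete, substitute."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # deleting the first i characters of a
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def distance_to_class(compound: str, exemplars: List[str]) -> int:
    """Stand-in for a string-to-grammar distance (assumption of this sketch):
    the smallest edit distance to any example string of the class."""
    return min(edit_distance(compound, e) for e in exemplars)


def predict(compound: str, classes: Dict[str, List[str]]) -> str:
    """Return the label of the class whose exemplars are closest."""
    return min(classes, key=lambda label: distance_to_class(compound, classes[label]))


# Hypothetical toy data: labels and strings are illustrative only.
toy_classes = {
    "class_a": ["CCO", "CCCO"],
    "class_b": ["c1ccccc1", "c1ccccc1Cl"],
}
label = predict("CCCCO", toy_classes)
```

A grammar inferred from the class examples would generalize beyond the listed strings, which a nearest-exemplar distance cannot do; the sketch only shows where an edit distance enters the decision.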

Related Topics
Physical Sciences and Engineering > Computer Science > Computer Vision and Pattern Recognition