Article code: 516343
Journal code: 1449176
Publication year: 2009
English article: 8 pages, full-text PDF
English title of the ISI article
Learning ontological rules to extract multiple relations of genic interactions from text
Related topics
Engineering and Basic Sciences, Computer Engineering, Computer Science Software
English abstract

Introduction
Information extraction (IE) systems have been proposed in recent years to extract genic interactions from bibliographical resources. They are limited to single interaction relations, and have to face a trade-off between recall and precision, by focusing either on specific interactions (for precision), or general and unspecified interactions of biological entities (for recall). Yet, biologists need to process more complex data from literature, in order to study biological pathways. An ontology is an adequate formal representation to model this sophisticated knowledge. However, the tight integration of IE systems and ontologies is still a current research issue, a fortiori with complex ones that go beyond hierarchies.

Method
We propose a rich modeling of genic interactions with an ontology, and show how it can be used within an IE system. The ontology is seen as a language specifying a normalized representation of text. First, IE is performed by extracting instances from natural language processing (NLP) modules. Then, deductive inferences on the ontology language are completed, and new instances are derived from previously extracted ones. Inference rules are learnt with an inductive logic programming (ILP) algorithm, using the ontology as the hypothesis language, and its instantiation on an annotated corpus as the example language. Learning is set in a multi-class setting to deal with the multiple ontological relations.

Results
We validated our approach on an annotated corpus of gene transcription regulations in the Bacillus subtilis bacterium. We reach a global recall of 89.3% and a precision of 89.6%, with high scores for the ten semantic relations defined in the ontology.
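
As a rough illustration of the deductive step described in the Method part of the abstract (new instances derived from previously extracted ones by applying inference rules), the sketch below runs a naive forward-chaining loop over a small set of facts. It is a minimal sketch under stated assumptions, not the authors' system: the relation names (`binds_to`, `site_of`, `interacts_with`), the entity names, and the single hand-written rule are all hypothetical, and in the paper the rules are learnt by ILP rather than written by hand.

```python
# Facts are (relation, subject, object) triples. Every relation and entity
# name below is a hypothetical placeholder, not taken from the paper's ontology.
extracted = {
    ("binds_to", "ProteinA", "site1"),
    ("site_of", "site1", "geneB"),
}

# A rule is (body, head). Terms starting with "?" are variables.
# This single rule is hand-written for illustration only.
rules = [
    (
        [("binds_to", "?p", "?s"), ("site_of", "?s", "?g")],
        ("interacts_with", "?p", "?g"),
    ),
]


def match(pattern, fact, binding):
    """Extend `binding` so that `pattern` equals `fact`, or return None."""
    new = dict(binding)
    for p, f in zip(pattern, fact):
        if p.startswith("?"):
            # Bind the variable, or check it against its earlier value.
            if new.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return new


def derive(facts, rules):
    """Apply the rules repeatedly until no new fact can be added."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            # Collect every variable binding that satisfies the whole body.
            bindings = [{}]
            for pattern in body:
                bindings = [
                    b2
                    for b in bindings
                    for fact in facts
                    if (b2 := match(pattern, fact, b)) is not None
                ]
            # Instantiate the head for each binding and record new facts.
            for b in bindings:
                new_fact = tuple(b.get(term, term) for term in head)
                if new_fact not in facts:
                    facts.add(new_fact)
                    changed = True
    return facts


derived = derive(extracted, rules)
```

With the two placeholder facts above, `derived` contains the original facts plus the inferred `("interacts_with", "ProteinA", "geneB")` triple. The loop stops at a fixpoint, so rules whose heads feed other rules' bodies would also be handled, at the cost of rescanning all facts on each pass.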

Publisher
Database: Elsevier - ScienceDirect
Journal: International Journal of Medical Informatics - Volume 78, Issue 12, December 2009, Pages e31–e38