Article code: 385620
Journal code: 660869
Publication year: 2011
English article: 6-page PDF
Full-text version: free download
English title of the ISI article
Combining multiple disambiguation methods for gene mention normalization
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
Abstract

The rapid growth of biomedical literature has prompted the biomedical text mining community to explore methods for accessing and managing this ever-increasing knowledge. One important text mining task in biomedical literature is gene mention normalization, which recognizes biomedical entities in biomedical texts and maps each gene mention discussed in the text to a unique organic database identifier. In this work, we employ an information-retrieval-based method that extracts a gene mention’s semantic profile from PubMed abstracts for gene mention disambiguation. This disambiguation method focuses on generating a more comprehensive representation of the gene mention rather than relying on organic clues such as gene ontology, which have fewer co-occurrences with the gene mention. Furthermore, we use an existing biomedical resource as another disambiguation method. We then extract features from the output of the gene mention detection system to build a false positive filter based on documents retrieved from Wikipedia. Our system achieved an F-measure of 83.1% on the BioCreative II GN test data.
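The abstract does not give implementation details, so the following is only a minimal sketch of the general idea of profile-based disambiguation: represent each candidate gene by a bag-of-words profile and pick the candidate whose profile is most similar to the context of the mention. The use of raw term frequencies and cosine similarity, the function names, and the identifiers `GENE_A` and `GENE_B` with their snippets are all assumptions made for illustration, not the authors' method or data.

```python
import math
from collections import Counter


def profile(texts):
    """Build a term-frequency profile from a list of text snippets."""
    counts = Counter()
    for text in texts:
        counts.update(text.lower().split())
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def disambiguate(context, candidate_profiles):
    """Return the candidate identifier whose profile best matches the context."""
    context_vec = profile([context])
    return max(
        candidate_profiles,
        key=lambda cid: cosine(context_vec, candidate_profiles[cid]),
    )


# Hypothetical candidates: identifiers and snippets are invented for illustration.
candidates = {
    "GENE_A": profile(["tumor suppressor protein cell cycle arrest apoptosis"]),
    "GENE_B": profile(["membrane transporter ion channel neuron signaling"]),
}
best = disambiguate("the protein induces cell cycle arrest and apoptosis", candidates)
print(best)
```

In a real system the profiles would presumably be built from many retrieved abstracts per gene and weighted (for example with TF-IDF) rather than from a single made-up sentence.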

Research highlights
► We employ an information-retrieval-based semantic profile method for gene mention normalization.
► We integrate Wikipedia as a resource with our other methods to improve the results.
► We achieve an F-measure of 83.1% on the BioCreative II GN test data.
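For readers unfamiliar with the metric, the sketch below shows how an F-measure is computed from true positive, false positive, and false negative counts. It assumes the balanced F1 variant (`beta=1.0`); the listing itself only says "F-measure". The counts are hypothetical and are not the paper's results.

```python
def f_measure(tp, fp, fn, beta=1.0):
    """F-measure from raw counts; beta=1.0 gives the balanced F1 score."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts chosen for easy arithmetic, not taken from the paper.
score = f_measure(tp=80, fp=20, fn=20)
print(f"{score:.3f}")
```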

Publisher
Database: Elsevier - ScienceDirect
Journal: Expert Systems with Applications - Volume 38, Issue 7, July 2011, Pages 7994–7999