Article code: 518264
Journal code: 867571
Publication year: 2011
English article: 11 pages (PDF)
English title of the ISI article
Automatic figure classification in bioscience literature
Related topics
Engineering and Basic Sciences, Computer Engineering, Computer Science Applications
English abstract

Millions of figures appear in biomedical articles, and it is important to develop an intelligent figure search engine to return relevant figures based on user entries. In this study we report a figure classifier that automatically classifies biomedical figures into five predefined figure types: Gel-image, Image-of-thing, Graph, Model, and Mix. The classifier explored rich image features and integrated them with text features. We performed feature selection and explored different classification models, including a rule-based figure classifier, a supervised machine-learning classifier, and a multi-model classifier, the latter of which integrated the first two classifiers. Our results show that feature selection improved figure classification and the novel image features we explored were the best among image features that we have examined. Our results also show that integrating text and image features achieved better performance than using either of them individually. The best system is a multi-model classifier which combines the rule-based hierarchical classifier and a support vector machine (SVM) based classifier, achieving a 76.7% F1-score for five-type classification. We demonstrated our system at http://figureclassification.askhermes.org/.

In this study, we report a figure classifier that automatically classifies biomedical figures into five predefined figure types: Gel-image, Image-of-thing, Graph, Model, and Mix. We developed a multi-model figure classifier that integrates a rule-based figure classifier and a supervised machine-learning classifier. The multi-model figure classifier first applied the rule-based classifier to separate figures into texture and nontexture groups, and then applied the rule-based classifier again to split the texture group into Gel-image and Image-of-thing. In contrast, the nontexture group (Graph and Model) was separated by the supervised machine-learning classifier (specifically, an SVM).
Highlights
► We developed a figure classifier that automatically classifies biomedical figures into five predefined figure types.
► Feature selection and novel image features improved figure classification.
► Integrating text and image features achieved better performance than using either alone.
► The best system is a multi-model classifier which combines the rule-based hierarchical classifier and a support vector machine (SVM) based classifier.
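
The routing described in the summary above (rules for texture versus nontexture, rules again inside the texture group, a learned model inside the nontexture group) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the class name `MultiModelClassifier`, the three callables, the toy feature layout, and the 0.5 thresholds are all hypothetical, and the summary does not say how the Mix type is assigned, so the sketch leaves it out.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

Features = Sequence[float]


@dataclass
class MultiModelClassifier:
    """Hierarchical router: rules first, a learned model for the nontexture branch.

    All three callables are hypothetical placeholders supplied by the caller.
    """
    is_texture: Callable[[Features], bool]      # rule: texture vs. nontexture
    is_gel: Callable[[Features], bool]          # rule: Gel-image vs. Image-of-thing
    graph_or_model: Callable[[Features], str]   # stand-in for a trained SVM

    def classify(self, features: Features) -> str:
        # Texture branch: decided entirely by rules.
        if self.is_texture(features):
            return "Gel-image" if self.is_gel(features) else "Image-of-thing"
        # Nontexture branch: delegated to the learned classifier.
        label = self.graph_or_model(features)
        if label not in ("Graph", "Model"):
            raise ValueError(f"unexpected label from learned classifier: {label!r}")
        return label


# Toy setup (assumed): features = (texture_score, band_score, axis_score).
toy = MultiModelClassifier(
    is_texture=lambda f: f[0] > 0.5,
    is_gel=lambda f: f[1] > 0.5,
    graph_or_model=lambda f: "Graph" if f[2] > 0.5 else "Model",
)

print(toy.classify((0.9, 0.8, 0.0)))
print(toy.classify((0.1, 0.0, 0.9)))
```

In a fuller version, `graph_or_model` could wrap the `predict` method of a trained SVM (for example scikit-learn's `sklearn.svm.SVC`) fed with the combined text and image features; that wiring is likewise an assumption rather than something the page specifies.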

Publisher
Database: Elsevier - ScienceDirect
Journal: Journal of Biomedical Informatics - Volume 44, Issue 5, October 2011, Pages 848–858