Article code: 10238703
Journal code: 45682
Publication year: 2005
English article: 14-page PDF
Full-text version: free download
English title of the ISI article
The impact of metadata on the accuracy of automated patent classification
Related topics
Engineering and Basic Sciences, Chemical Engineering, Bioengineering
English abstract
During the last decade, the advance of machine-learning tools and algorithms has resulted in tremendous progress in the automated classification of documents. However, many classifiers base their classification decisions solely on document text and ignore metadata (such as authors, publication date, and author affiliation). In this project, automated classifiers using the k-Nearest Neighbour algorithm were developed for the classification of patents into two different classification systems. Those using metadata (in this case inventor names, applicant names and International Patent Classification codes) were compared with those ignoring it. The use of metadata could significantly improve the classification of patents with one classification system, improving classification accuracy from 70.8% up to 75.4%, which was highly statistically significant. However, the results for the other classification system were inconclusive: while metadata could improve the quality of the classifier for some experiments (recall increased from 66.0% to 68.9%, which was a small but nonetheless significant improvement), experiments with different parameters showed that it could also lead to a deterioration of quality (recall dropping as low as 61.0%). The study shows that metadata can play an extremely useful role in the classification of patents. Nonetheless, it must not be used indiscriminately but only after careful evaluation of its usefulness.
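The abstract does not say how the metadata were combined with the document text, so the following is only a minimal sketch of one plausible approach, not the authors' method. It assumes a k-Nearest Neighbour vote in which each neighbour's score blends cosine similarity over words with Jaccard overlap over metadata items. The `meta_weight` parameter, the function names, and the toy patents are all hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def jaccard(a, b):
    """Jaccard overlap between two sets of metadata items."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def knn_classify(query, training, k=3, meta_weight=0.0):
    """Classify query=(text, metadata_set) against (text, metadata_set, label) rows.

    meta_weight is an assumed blending factor: 0.0 ignores metadata entirely,
    1.0 ignores the text entirely.
    """
    q_words = Counter(query[0].lower().split())
    scored = []
    for text, meta, label in training:
        text_sim = cosine(q_words, Counter(text.lower().split()))
        meta_sim = jaccard(query[1], meta)
        scored.append(((1 - meta_weight) * text_sim + meta_weight * meta_sim, label))
    # Keep the k most similar training documents and let them vote,
    # each vote weighted by its similarity score.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    votes = Counter()
    for score, label in scored[:k]:
        votes[label] += score
    return votes.most_common(1)[0][0]


# Hypothetical toy data: metadata items are inventor/applicant names and IPC codes.
training = [
    ("polymer coating for steel pipes", {"ipc:C09D", "applicant:Acme"}, "coatings"),
    ("enzyme reactor for fermentation", {"ipc:C12M", "applicant:BioCo"}, "bioprocess"),
    ("coating reactor vessel lining", {"ipc:C12M", "applicant:BioCo"}, "bioprocess"),
]
query = ("coating for pipes", {"ipc:C12M", "applicant:BioCo"})

text_only = knn_classify(query, training, k=1, meta_weight=0.0)
with_metadata = knn_classify(query, training, k=1, meta_weight=0.7)
print(text_only, with_metadata)
```

In this toy data the text-only and metadata-weighted classifiers disagree on the query, which illustrates the abstract's caution: whether metadata helps or hurts depends on the data and parameters, so the weighting has to be evaluated rather than assumed.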
Publisher
Database: Elsevier - ScienceDirect
Journal: World Patent Information - Volume 27, Issue 1, March 2005, Pages 13-26