Automatic clustering of construction project documents based on textual similarity

Article code	Journal code	Publication year	English article	Full text
246743	502387	2014	14-page PDF	Free download

English title of the ISI article

Automatic clustering of construction project documents based on textual similarity

Keywords

Document management, single-pass clustering, supervised/unsupervised learning methods

Related subjects

Engineering and Basic Sciences, Other Engineering Disciplines, Civil and Structural Engineering

Abstract

• A hybrid approach for clustering semantically related project documents is proposed.
• In clustering, dimensionality and similarity threshold are inversely related.
• The tf–idf weighting method yields high precision and average recall.
• Refining the clustering outcome with supervised learning improves accuracy.
• Textual similarities can be used to reveal semantic relations between documents.

Text classifiers, as supervised learning methods, require a comprehensive training set that covers all classes in order to classify new instances. This limits the use of text classifiers for organizing construction project documents, since it is not guaranteed that sufficient samples are available for all possible document categories. To overcome the restriction imposed by this all-inclusive requirement, an unsupervised learning method was used to automatically cluster documents based on textual similarities. Repeated evaluations using different randomizations of the dataset revealed a region of threshold/dimensionality values with consistently high precision and average recall. Accordingly, a hybrid approach was proposed which initially uses an unsupervised method to develop core clusters and then trains a text classifier on the core clusters to classify outlier documents in a subsequent refinement step. Evaluation of the hybrid approach demonstrated a significant improvement in recall values, resulting in an overall increase in F-measure scores.
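The two-stage idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the classifier, the tokenizer, or the dimensionality-reduction step, so the sketch assumes whitespace tokens, a smoothed tf–idf weight, cosine similarity, single-pass clustering, and a nearest-centroid rule standing in for the text classifier. The function names, the `threshold` and `min_core` parameters, and the sample documents are all hypothetical.

```python
import math
from collections import Counter


def normalise(vec):
    """Scale a sparse term->weight mapping to unit length."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def tfidf_vectors(docs):
    """One unit-length tf-idf vector per document (assumed weighting: tf * (ln(N/df) + 1))."""
    tokenized = [doc.lower().split() for doc in docs]
    n = len(tokenized)
    df = Counter(term for tokens in tokenized for term in set(tokens))
    return [
        normalise({t: c * (math.log(n / df[t]) + 1.0) for t, c in Counter(tokens).items()})
        for tokens in tokenized
    ]


def cosine(a, b):
    """Cosine similarity of two unit-length sparse vectors (a plain dot product)."""
    if len(a) > len(b):
        a, b = b, a
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def centroid(vectors):
    """Unit-length mean direction of a group of vectors."""
    total = Counter()
    for vec in vectors:
        for t, w in vec.items():
            total[t] += w
    return normalise(total)


def single_pass(vectors, threshold):
    """Stage 1 (unsupervised): join the most similar cluster if it clears the threshold, else start a new one."""
    clusters = []
    for i, vec in enumerate(vectors):
        best, best_sim = None, threshold
        for cluster in clusters:
            sim = cosine(vec, centroid([vectors[j] for j in cluster]))
            if sim >= best_sim:
                best, best_sim = cluster, sim
        if best is None:
            clusters.append([i])
        else:
            best.append(i)
    return clusters


def hybrid(docs, threshold=0.5, min_core=2):
    """Stage 2 (assumed nearest-centroid classifier): assign outliers to the closest core cluster."""
    vectors = tfidf_vectors(docs)
    clusters = single_pass(vectors, threshold)
    core = [c for c in clusters if len(c) >= min_core]
    outliers = [i for c in clusters if len(c) < min_core for i in c]
    if not core:
        return clusters
    centroids = [centroid([vectors[j] for j in c]) for c in core]
    for i in outliers:
        sims = [cosine(vectors[i], cen) for cen in centroids]
        core[sims.index(max(sims))].append(i)
    return core


# Hypothetical project documents, for illustration only.
docs = [
    "concrete pour delay slab level two",
    "concrete pour delay slab level three",
    "payment invoice overdue subcontractor steel",
    "payment invoice overdue subcontractor glazing",
    "slab concrete curing",
]
print(single_pass(tfidf_vectors(docs), 0.5))
print(hybrid(docs, threshold=0.5))
```

With a strict threshold, the last document is left as a singleton by the clustering stage, which is the kind of high-precision, lower-recall behaviour the abstract describes; the refinement stage then attaches it to the closest core cluster. In this sketch every outlier is assigned somewhere, even when its similarity to all core clusters is low, which is a simplification.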

Publisher

Database: Elsevier - ScienceDirect
Journal: Automation in Construction - Volume 42, June 2014, Pages 36–49

Authors

Mohammed Al Qady, Amr Kandil
