Probability based document clustering and image clustering using content-based image retrieval

Article code	Journal code	Publication year	English article	Full-text version
496047	862848	2013	8-page PDF	Free download

Keywords

CBIR; Content-based image retrieval; Document clustering; Word frequency; Region of interest

Related topics

Engineering and Basic Sciences, Computer Engineering, Computer Science Software

Abstract

Clustering related or similar objects has long been regarded as a useful way to help users navigate an information space such as a document collection. Many clustering algorithms and techniques have been developed and implemented, but as document collections have grown, these techniques have not scaled to large collections because of their computational overhead. To address this problem, the proposed system concentrates on an interactive text clustering methodology: probability-based, topic-oriented, semi-supervised document clustering. Because web pages and other documents now contain both text and a large number of images, the proposed system also applies content-based image retrieval (CBIR) to image clustering to complement the document clustering approach. It suggests two kinds of indexing keys, major colour sets (MCS) and distribution block signatures (DBS), to prune away images irrelevant to a given query image. Major colour sets carry colour information, while distribution block signatures carry spatial information. After these filters are applied successively to a large database, only a small number of high-potential candidates that are somewhat similar to the query image remain. The system then uses the quad modelling (QM) method to set the initial weights of two-dimensional cells in the query image according to each major colour, and retrieves more similar images through a similarity association function associated with the weights. System efficiency is evaluated by implementing and testing the clustering results with the DBSCAN and K-means clustering algorithms. Experiments show that the proposed document clustering algorithm performs with an average efficiency of 94.4% for various document categories.
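
The abstract does not define MCS or DBS precisely, so the following is only a rough sketch of the two-stage pruning idea, not the authors' implementation. It assumes that a major colour set is the set of coarsely quantized colours covering at least a given share of the pixels, and that a distribution block signature is the set of grid blocks in which a colour occurs. All function names, the `levels`, `min_share`, `grid` and `min_overlap` parameters, and the Jaccard overlap test are hypothetical choices made for illustration.

```python
from collections import Counter

def quantize(pixel, levels=4):
    """Map an (r, g, b) pixel with 0-255 channels to a coarse colour bin."""
    step = 256 // levels
    return tuple(channel // step for channel in pixel)

def major_colour_set(image, levels=4, min_share=0.2):
    """Assumed MCS: colour bins covering at least min_share of the pixels."""
    counts = Counter(quantize(p, levels) for row in image for p in row)
    total = sum(counts.values())
    return {colour for colour, n in counts.items() if n / total >= min_share}

def block_signature(image, colour, grid=2, levels=4):
    """Assumed DBS: the (block_row, block_col) cells in which colour occurs."""
    height, width = len(image), len(image[0])
    cells = set()
    for y, row in enumerate(image):
        for x, pixel in enumerate(row):
            if quantize(pixel, levels) == colour:
                cells.add((y * grid // height, x * grid // width))
    return cells

def candidates(query, database, min_overlap=0.5):
    """Apply the colour filter, then the spatial filter, to prune the database."""
    query_colours = major_colour_set(query)
    survivors = []
    for name, image in database.items():
        shared = query_colours & major_colour_set(image)
        if not shared:
            continue  # colour filter: no major colour in common
        # Spatial filter: a shared colour must occupy overlapping blocks
        # (Jaccard overlap of the two block sets, an illustrative choice).
        scores = []
        for colour in shared:
            q_cells = block_signature(query, colour)
            i_cells = block_signature(image, colour)
            scores.append(len(q_cells & i_cells) / len(q_cells | i_cells))
        if max(scores) >= min_overlap:
            survivors.append(name)
    return survivors
```

Here an image is a list of rows of `(r, g, b)` tuples and the database is a dictionary from image name to image. The survivors of `candidates` would then be passed to a finer similarity measure, which the abstract describes as quad modelling and which is not sketched here.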

Highlights
► High dimensionality is reduced by using distinct words.
► The proposed system accepts incoming documents at any time.
► The proposed method accepts all categories of documents.
► Training and testing can be done at any time.
► The proposed model concentrates on the probability of word occurrences.
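
The highlights can be illustrated with a small sketch of topic assignment driven by word-occurrence probabilities. This is an assumption-laden illustration, not the paper's algorithm: it swaps in a plain Laplace-smoothed word-probability score (in the spirit of naive Bayes), and the `TopicModel` class with its `train`, `log_probability` and `assign` methods is a hypothetical name set. It does reflect three points from the list above: only distinct words are scored, any topic label is accepted, and training can be interleaved with assignment at any time.

```python
import math
import re
from collections import Counter, defaultdict

class TopicModel:
    """Incremental word-probability model over labelled seed documents."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # topic -> word -> count
        self.vocabulary = set()                  # distinct words seen so far

    @staticmethod
    def tokenize(text):
        """Lower-case the text and keep alphabetic runs as words."""
        return re.findall(r"[a-z]+", text.lower())

    def train(self, topic, text):
        """Add one labelled document; may be called at any time."""
        words = self.tokenize(text)
        self.word_counts[topic].update(words)
        self.vocabulary.update(words)

    def log_probability(self, topic, text):
        """Sum of Laplace-smoothed log P(word | topic) over distinct known words."""
        counts = self.word_counts[topic]
        total = sum(counts.values()) + len(self.vocabulary)
        return sum(math.log((counts[w] + 1) / total)
                   for w in set(self.tokenize(text)) if w in self.vocabulary)

    def assign(self, text):
        """Return the topic under which the document's words are most probable."""
        return max(self.word_counts, key=lambda t: self.log_probability(t, text))
```

A typical use would be to call `train` with a few seed documents per topic, call `assign` on each incoming document, and optionally feed confirmed assignments back through `train`.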

Publisher

Database: Elsevier - ScienceDirect
Journal: Applied Soft Computing - Volume 13, Issue 2, February 2013, Pages 959–966

Authors

M. Karthikeyan, P. Aruna
