Automatic classification of Tamil documents using vector space model and artificial neural network

کد مقاله	کد نشریه	سال انتشار	مقاله انگلیسی	نسخه تمام متن
387849	660911	2009	5 صفحه PDF	دانلود رایگان

عنوان انگلیسی مقاله ISI

دانلود مقاله + سفارش ترجمه

دانلود مقاله ISI انگلیسی

رایگان برای ایرانیان

کلمات کلیدی

Artificial neural network model - مدل شبکه عصبی مصنوعی Vector space model - مدل فضای برداری

موضوعات مرتبط

مهندسی و علوم پایه مهندسی کامپیوتر هوش مصنوعی

پیش نمایش صفحه اول مقاله

Automatic classification of Tamil documents using vector space model and artificial neural network

چکیده انگلیسی

Automatic text classification based on vector space model (VSM), artificial neural networks (ANN), K-nearest neighbor (KNN), Naives Bayes (NB) and support vector machine (SVM) have been applied on English language documents, and gained popularity among text mining and information retrieval (IR) researchers. This paper proposes the application of VSM and ANN for the classification of Tamil language documents. Tamil is morphologically rich Dravidian classical language. The development of internet led to an exponential increase in the amount of electronic documents not only in English but also other regional languages. The automatic classification of Tamil documents has not been explored in detail so far. In this paper, corpus is used to construct and test the VSM and ANN models. Methods of document representation, assigning weights that reflect the importance of each term are discussed. In a traditional word-matching based categorization system, the most popular document representation is VSM. This method needs a high dimensional space to represent the documents. The ANN classifier requires smaller number of features. The experimental results show that ANN model achieves 93.33% which is better than the performance of VSM which yields 90.33% on Tamil document classification.

ناشر

Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Expert Systems with Applications - Volume 36, Issue 8, October 2009, Pages 10914–10918

نویسندگان

K. Rajan, V. Ramalingam, M. Ganesan, S. Palanivel, B. Palaniappan,

علوم انسانی و هنر

فنی، مهندسی و علوم پایه

پزشکی و سلامت

بیو تکنولوژی

پذیرش سفارش ترجمه

Automatic classification of Tamil documents using vector space model and artificial neural network

دسترسی سریع

ارتباط

English Website