Article code: 484015
Journal code: 703126
Publication year: 2015
English article: 12 pages
Full-text version: PDF
English title of the ISI article
Clustering and classification of email contents
Persian translation of the title
خوشه بندی و طبقه بندی محتویات ایمیل
Keywords
Email classification, document similarity, document classification, feature extraction, subject classification, content classification
Related subjects
Engineering and Basic Sciences; Computer Engineering; Computer Science (General)
English abstract

Users depend heavily on email as one of their major means of communication. Its importance and usage continue to grow despite the evolution of mobile applications, social networks, etc. Email is used at both the personal and professional levels, and messages can be considered official documents in communication among users. Email data mining and analysis can be conducted for several purposes, such as spam detection and classification, subject classification, etc. In this paper, a large set of personal emails is used for the purpose of folder and subject classification. Algorithms are developed to perform clustering and classification on this large text collection. Classification based on N-grams is shown to be the best for such a large text collection, especially as the text is bilingual (i.e., with English and Arabic content).

Publisher
Database: Elsevier - ScienceDirect
Journal: Journal of King Saud University - Computer and Information Sciences - Volume 27, Issue 1, January 2015, Pages 46–57
Authors
, ,