Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
6854115 | Engineering Applications of Artificial Intelligence | 2018 | 11 Pages |
Abstract
Email is far more convenient than traditional mail for delivering messages. However, in business settings it is susceptible to information leakage. This problem can be alleviated by classifying emails into security levels using text mining and machine learning. In this research, we developed a scheme in which a neural network extracts information from emails and transforms it into a multidimensional vector. Email text is processed as bi-grams to train the document vector, and the resulting vectors then undergo under-sampling to deal with data imbalance. Finally, an artificial neural network classifies the security label of each email. The proposed system was evaluated in an actual corporate setting. The results show that the proposed feature extraction approach represents email data more effectively than existing methods, as measured by true positive rate and F1-score.
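As a rough illustration of three steps named in the abstract (bi-gram processing, under-sampling, and evaluation by true positive rate and F1-score), the following minimal Python sketch may help. It is not the authors' implementation: the neural document vector and the neural network classifier are omitted, the abstract does not say which under-sampling variant was used (random under-sampling is assumed here), and the function names and the labels `"secret"` and `"public"` are hypothetical.

```python
import random
from collections import Counter


def word_bigrams(text):
    """Return adjacent word pairs from a lowercased, whitespace-split text."""
    tokens = text.lower().split()
    return list(zip(tokens, tokens[1:]))


def random_undersample(samples, labels, seed=0):
    """Randomly drop samples so each class keeps as many items as the rarest class.

    Assumption: plain random under-sampling; the abstract does not name a variant.
    """
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = min(len(items) for items in by_class.values())
    balanced = []
    for label, items in by_class.items():
        for sample in rng.sample(items, target):
            balanced.append((sample, label))
    rng.shuffle(balanced)
    return [s for s, _ in balanced], [l for _, l in balanced]


def tpr_and_f1(y_true, y_pred, positive):
    """Compute the true positive rate (recall) and F1-score for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tpr = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = 2 * precision * tpr / (precision + tpr) if precision + tpr else 0.0
    return tpr, f1


if __name__ == "__main__":
    # Hypothetical toy data: five "public" emails and two "secret" ones.
    emails = ["mail %d" % i for i in range(7)]
    levels = ["public"] * 5 + ["secret"] * 2
    _, balanced_levels = random_undersample(emails, levels)
    print(Counter(balanced_levels))
    print(word_bigrams("Please keep this confidential"))
```

In a fuller pipeline, the bi-gram tokens would feed a document embedding model, and under-sampling would be applied to the training split only, so that the evaluation data keeps its original class distribution.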
Related Topics
Physical Sciences and Engineering
Computer Science
Artificial Intelligence
Authors
Jen-Wei Huang, Chia-Wen Chiang, Jia-Wei Chang,