کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
456373 695702 2014 14 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Applicability of Latent Dirichlet Allocation to multi-disk search
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر شبکه های کامپیوتری و ارتباطات
پیش نمایش صفحه اول مقاله
Applicability of Latent Dirichlet Allocation to multi-disk search
چکیده انگلیسی

Digital forensics practitioners face a continual increase in the volume of data they must analyze, which exacerbates the problem of finding relevant information in a noisy domain. Current technologies make use of keyword based search to isolate relevant documents and minimize false positives with respect to investigative goals. Unfortunately, selecting appropriate keywords is a complex and challenging task. Latent Dirichlet Allocation (LDA) offers a possible way to relax keyword selection by returning topically similar documents. This research compares regular expression search techniques and LDA using the Real Data Corpus (RDC). The RDC, a set of over 2400 disks from real users, is first analyzed to craft effective tests. Three tests are executed with the results indicating that, while LDA search should not be used as a replacement to regular expression search, it does offer benefits. First, it is able to locate documents when few, if any, of the keywords exist within them. Second, it improves data browsing and deals with keyword ambiguity by segmenting the documents into topics.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Digital Investigation - Volume 11, Issue 1, March 2014, Pages 43–56
نویسندگان
, ,