Free article download: Strength of linguistic text evidence: A fused forensic text comparison system

Article code	Journal code	Publication year	English article	Full text
6462208	1421972	2017	14 pages (PDF)	Free download

English title of the ISI article

Strength of linguistic text evidence: A fused forensic text comparison system



Keywords

N-grams; Likelihood ratio

Related topics

Engineering and Basic Sciences > Chemistry > Analytical Chemistry


Abstract

- A description of estimating strength of authorship attribution evidence within the likelihood ratio framework.
- Efficacy of the likelihood ratio framework for authorship attribution text evidence.
- Efficacy of logistic-regression-fusion for authorship attribution text evidence.
- Effect of data sample size on the performance of the likelihood ratio-based forensic text comparison system.

Compared to other forensic comparative sciences, studies of the efficacy of the likelihood ratio (LR) framework in forensic authorship analysis are lagging. An experiment is described concerning the estimation of strength of linguistic text evidence within that framework. The LRs were estimated by trialling three different procedures: one is based on the multivariate kernel density (MVKD) formula, with each group of messages being modelled as a vector of authorship attribution features; the other two involve N-grams based on word tokens and characters, respectively. The LRs that were separately estimated from the three different procedures are logistic-regression-fused to obtain a single LR for each author comparison. This study used predatory chatlog messages sampled from 115 authors. To see how the number of word tokens affects the performance of a forensic text comparison (FTC) system, token numbers used for modelling each group of messages were progressively increased: 500, 1000, 1500 and 2500 tokens. The performance of the FTC system is assessed using the log-likelihood-ratio cost (Cllr), which is a gradient metric for the quality of LRs, and the strength of the derived LRs is charted as Tippett plots. It is demonstrated in this study that (i) out of the three procedures, the MVKD procedure with authorship attribution features performed best in terms of Cllr, and that (ii) the fused system outperformed all three of the single procedures. When the token length is 1500, for example, the fused system achieved a Cllr value of 0.15. Some unrealistically strong LRs were observed in the results. Reasons for these are discussed, and a possible solution to the problem, namely the empirical lower and upper bound LR (ELUB) method is trialled and applied to the LRs of the best-achieving fusion system.
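The abstract names the log-likelihood-ratio cost (Cllr) as its performance metric without spelling it out. The sketch below uses the commonly cited definition, Cllr = 0.5 * (mean of log2(1 + 1/LR) over same-author comparisons + mean of log2(1 + LR) over different-author comparisons). The function name `cllr`, its parameter names, and the example LR values are assumptions made for illustration; they are not taken from the paper, and the paper's own implementation is not shown on this page.

```python
import math


def cllr(same_author_lrs, different_author_lrs):
    """Log-likelihood-ratio cost for a set of validation LRs.

    same_author_lrs: LRs from comparisons where both message groups
        truly come from the same author (ideally much greater than 1).
    different_author_lrs: LRs from comparisons of different authors
        (ideally much smaller than 1).
    Lower is better; a system that always outputs LR = 1 scores 1.0.
    """
    if not same_author_lrs or not different_author_lrs:
        raise ValueError("both LR lists must be non-empty")
    if any(lr <= 0 for lr in list(same_author_lrs) + list(different_author_lrs)):
        raise ValueError("likelihood ratios must be positive")

    # Penalty for same-author LRs that are small (misleadingly weak).
    same_penalty = sum(math.log2(1 + 1 / lr) for lr in same_author_lrs)
    same_penalty /= len(same_author_lrs)

    # Penalty for different-author LRs that are large (misleadingly strong).
    diff_penalty = sum(math.log2(1 + lr) for lr in different_author_lrs)
    diff_penalty /= len(different_author_lrs)

    return 0.5 * (same_penalty + diff_penalty)


# Hypothetical LR values, chosen only to show the behaviour of the metric.
uninformative = cllr([1.0, 1.0], [1.0, 1.0])
well_behaved = cllr([100.0, 1000.0], [0.01, 0.001])
misleading = cllr([0.01], [100.0])
```

With these made-up inputs, the uninformative system scores exactly 1, the well-behaved system scores close to 0, and the misleading system scores well above 1. This is why a Cllr value such as the 0.15 reported for the fused system at 1500 tokens is read as good performance.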

Publisher

Database: Elsevier - ScienceDirect
Journal: Forensic Science International - Volume 278, September 2017, Pages 184-197

Authors

Shunichi Ishihara
