Article code: 587277
Journal code: 1453303
Publication year: 2016
English article: 12-page PDF
Full-text version: free download
English title of the ISI article
Bayesian decision support for coding occupational injury data
Persian translation of the title
پشتیبانی تصمیم‌گیری بیزی برای کدگذاری اطلاعات آسیب شغلی
Related topics
Engineering and Basic Sciences, Chemical Engineering, Chemical Health and Safety
English abstract


• A semi-automated approach using Bayesian models is proposed for coding injury data.
• Accuracy of proposed approach assuming expert coders was comparable to original manual coding.
• Agreement between different models and prediction strength threshold improve accuracy.
• Top 5 predictions from Naïve Bayes model yield very good accuracy.
• Confusion matrix is useful for identifying misclassifications of rare categories.
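
The two evaluation ideas in the highlights above, top k sensitivity and a confusion matrix, can be illustrated with a minimal Python sketch. The injury codes and ranked predictions below are hypothetical placeholders rather than SOII data, and the function names are assumptions introduced only for this example.

```python
from collections import Counter


def top_k_sensitivity(true_codes, ranked_predictions, k):
    """Share of cases whose true code appears among the first k predictions."""
    hits = sum(true in ranked[:k]
               for true, ranked in zip(true_codes, ranked_predictions))
    return hits / len(true_codes)


def confusion_counts(true_codes, ranked_predictions):
    """Count (true code, top prediction) pairs; mismatched pairs are errors."""
    return Counter((true, ranked[0])
                   for true, ranked in zip(true_codes, ranked_predictions))


# Hypothetical codes and model rankings, for illustration only.
true_codes = ["FALL", "BURN", "CUT", "RARE"]
ranked = [
    ["FALL", "CUT", "BURN"],
    ["CUT", "BURN", "FALL"],
    ["CUT", "FALL", "BURN"],
    ["FALL", "BURN", "RARE"],
]

top1 = top_k_sensitivity(true_codes, ranked, 1)
top3 = top_k_sensitivity(true_codes, ranked, 3)
confusion = confusion_counts(true_codes, ranked)
```

In this toy data the rare category is never the top prediction but does appear within the top three, which is the kind of pattern a confusion matrix and a top k list would surface for a human coder.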

Introduction
Studies on autocoding injury data have found that machine learning algorithms perform well for categories that occur frequently but often struggle with rare categories. Therefore, manual coding, although resource-intensive, cannot be eliminated. We propose a Bayesian decision support system to autocode a large portion of the data, filter cases for manual review, and assist human coders by presenting them top k prediction choices and a confusion matrix of predictions from Bayesian models.

Method
We studied the prediction performance of Single-Word (SW) and Two-Word-Sequence (TW) Naïve Bayes models on a sample of data from the 2011 Survey of Occupational Injury and Illness (SOII). We used the agreement in prediction results of SW and TW models, and various prediction strength thresholds for autocoding and filtering cases for manual review. We also studied the sensitivity of the top k predictions of the SW model, TW model, and SW–TW combination, and then compared the accuracy of the manually assigned codes to SOII data with that of the proposed system.

Results
The accuracy of the proposed system, assuming well-trained coders reviewing a subset of only 26% of cases flagged for review, was estimated to be comparable (86.5%) to the accuracy of the original coding of the data set (range: 73%–86.8%). Overall, the TW model had higher sensitivity than the SW model, and the accuracy of the prediction results increased when the two models agreed, and for higher prediction strength thresholds. The sensitivity of the top five predictions was 93%.

Conclusions
The proposed system seems promising for coding injury data as it offers comparable accuracy and less manual coding.

Practical Applications
Accurate and timely coded occupational injury data is useful for surveillance as well as prevention activities that aim to make workplaces safer.
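
The filtering idea in the abstract (autocode a case when the SW and TW models agree and the prediction strength clears a threshold, otherwise send it to manual review with top k suggestions) can be sketched in Python as follows. This is not the authors' implementation: the add-one smoothing, the use of the posterior probability as "prediction strength", the choice to draw suggestions from the SW model, the threshold values, and the toy narratives and codes are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def features(text, n):
    """Return single words (n=1) or adjacent two-word sequences (n=2)."""
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing over n-word features."""

    def __init__(self, n):
        self.n = n
        self.class_counts = Counter()
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()

    def fit(self, texts, codes):
        for text, code in zip(texts, codes):
            self.class_counts[code] += 1
            for f in features(text, self.n):
                self.feature_counts[code][f] += 1
                self.vocab.add(f)
        return self

    def posteriors(self, text):
        """Return (code, probability) pairs, most likely first."""
        total_docs = sum(self.class_counts.values())
        log_scores = {}
        for code, count in self.class_counts.items():
            denom = sum(self.feature_counts[code].values()) + len(self.vocab)
            score = math.log(count / total_docs)
            for f in features(text, self.n):
                if f in self.vocab:  # skip features never seen in training
                    score += math.log((self.feature_counts[code][f] + 1) / denom)
            log_scores[code] = score
        # Normalise in log space to obtain posterior probabilities.
        top = max(log_scores.values())
        exp = {c: math.exp(s - top) for c, s in log_scores.items()}
        norm = sum(exp.values())
        return sorted(((c, v / norm) for c, v in exp.items()),
                      key=lambda cv: cv[1], reverse=True)


def triage(sw, tw, text, threshold=0.9, k=3):
    """Autocode if both models agree and are confident; otherwise flag."""
    sw_ranked, tw_ranked = sw.posteriors(text), tw.posteriors(text)
    (sw_code, sw_p), (tw_code, tw_p) = sw_ranked[0], tw_ranked[0]
    if sw_code == tw_code and min(sw_p, tw_p) >= threshold:
        return {"action": "autocode", "code": sw_code}
    # Assumption: suggestions for the human coder come from the SW model.
    return {"action": "manual_review",
            "suggestions": [c for c, _ in sw_ranked[:k]]}


# Hypothetical narratives and codes, for illustration only.
narratives = [
    "worker fell from ladder",
    "employee fell off ladder while painting",
    "slipped on wet floor and fell",
    "burned hand on hot oven",
    "hot oil burned arm",
    "cut finger with knife",
    "knife cut hand while slicing",
]
codes = ["FALL", "FALL", "FALL", "BURN", "BURN", "CUT", "CUT"]

sw = NaiveBayes(1).fit(narratives, codes)
tw = NaiveBayes(2).fit(narratives, codes)
lenient = triage(sw, tw, "fell from ladder", threshold=0.6)
strict = triage(sw, tw, "fell from ladder", threshold=0.95)
```

Raising the threshold moves more cases into manual review, which reflects the trade-off the abstract describes between autocoding accuracy and the amount of manual coding.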

Publisher
Database: Elsevier - ScienceDirect
Journal: Journal of Safety Research - Volume 57, June 2016, Pages 71–82