• Integrating domain knowledge with the learning algorithm increased the interpretability of the predictive models without significantly affecting predictive performance.
• A quantitative analysis of interpretability is given based on information loss caused by dimensionality reduction.
• The method is evaluated and analysed on hospital readmission prediction using pediatric patient data from the California State Inpatient Databases (SID).
• Interpretations of models comply with existing medical understanding of pediatric readmission.
Objectives
Quantification and early identification of unplanned readmission risk have the potential to improve the quality of care during hospitalization and after discharge. However, the high dimensionality, sparsity, and class imbalance of electronic health data, together with the complexity of risk quantification, make it difficult to develop accurate predictive models. Predictive models also require a certain level of interpretability in order to be applicable in real settings and to create actionable insights. This paper aims to develop accurate and interpretable predictive models for readmission in a general pediatric patient population by integrating a data-driven model (sparse logistic regression) with domain knowledge based on the International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) hierarchy of diseases. Additionally, we propose a way to quantify the interpretability of a model and inspect the stability of alternative solutions.
Materials and methods
The analysis was conducted on more than 66,000 pediatric hospital discharge records from the California State Inpatient Databases, Healthcare Cost and Utilization Project, between 2009 and 2011. We incorporated domain knowledge based on the ICD-9-CM hierarchy into a data-driven, Tree-Lasso regularized logistic regression model, providing the framework for model interpretation. Compared with traditional Lasso logistic regression, this approach yielded models that are easier to interpret because they rely on fewer high-level diagnoses, with comparable prediction accuracy.
Results
The results revealed that the Tree-Lasso model was competitive with traditional Lasso logistic regression in terms of accuracy (measured by the area under the receiver operating characteristic curve, AUC), while integration with the ICD-9-CM hierarchy of diseases provided more interpretable models in terms of high-level diagnoses. Additionally, the interpretations of the models are in accordance with existing medical understanding of pediatric readmission. The best-performing models have similar performance, reaching AUC values of 0.783 and 0.779 for traditional Lasso and Tree-Lasso, respectively. However, the information loss of the Lasso models is 0.35 bits higher than that of the Tree-Lasso model.
Conclusions
We propose a method for building predictive models for the detection of readmission risk based on electronic health records. Integration of domain knowledge (in the form of the ICD-9-CM taxonomy) with a data-driven, sparse predictive algorithm (Tree-Lasso logistic regression) increased the interpretability of the resulting model. The models are interpreted for the readmission prediction problem in the general pediatric population in California, as well as in several important subpopulations, and the interpretations comply with existing medical understanding of pediatric readmission. Finally, a quantitative assessment of the interpretability of the models is given that goes beyond simple counts of selected low-level features.
Journal: Artificial Intelligence in Medicine - Volume 72, September 2016, Pages 12–21
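As a rough illustration of how a disease hierarchy can enter the regularizer, the sketch below contrasts a plain Lasso penalty with a tree-structured group penalty of the kind usually associated with Tree-Lasso: a weighted sum, over the nodes of the tree, of the L2 norm of the coefficients beneath each node. The toy hierarchy, the group names, and the unit node weights are assumptions made for illustration; they are not taken from the paper, whose exact formulation may differ.

```python
import math

# Hypothetical toy hierarchy (not from the paper): each node of the tree maps
# to the indices of the low-level diagnosis features (leaves) beneath it.
GROUPS = {
    "root": [0, 1, 2, 3],
    "respiratory": [0, 1],
    "digestive": [2, 3],
    "leaf_0": [0],
    "leaf_1": [1],
    "leaf_2": [2],
    "leaf_3": [3],
}


def lasso_penalty(coefs):
    """Plain L1 penalty: sum of absolute coefficient values."""
    return sum(abs(c) for c in coefs)


def tree_lasso_penalty(coefs, groups, node_weights=None):
    """Sum over tree nodes of weight * L2 norm of the coefficients under the node.

    node_weights is an optional dict of per-node weights; nodes that are
    missing from it (or all nodes, if it is None) get an assumed weight of 1.0.
    """
    total = 0.0
    for name, indices in groups.items():
        weight = 1.0 if node_weights is None else node_weights.get(name, 1.0)
        total += weight * math.sqrt(sum(coefs[i] ** 2 for i in indices))
    return total


def active_nodes(coefs, groups):
    """Names of the nodes that have at least one nonzero coefficient beneath them."""
    return [name for name, indices in groups.items()
            if any(coefs[i] != 0.0 for i in indices)]


# Example: the whole "digestive" subtree is switched off.
coefs = [0.5, -0.5, 0.0, 0.0]
print(lasso_penalty(coefs))
print(tree_lasso_penalty(coefs, GROUPS))
print(active_nodes(coefs, GROUPS))
```

Because every internal node contributes its own norm, a penalty of this shape tends to zero out whole subtrees at once, which is one way a fitted model can end up being described by a few high-level diagnoses rather than by many individual codes.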