Article ID: 10356131
Journal: Journal of Biomedical Informatics
Published Year: 2011
Pages: 8
File Type: PDF
Abstract
Statistical text mining (STM) was used to supplement efforts to develop a clinical vocabulary for post-traumatic stress disorder (PTSD) in the Department of Veterans Affairs (VA). A set of outpatient progress notes was collected at one VA hospital for a cohort of 405 unique veterans with PTSD and a comparison group of 392 veterans with other psychological conditions. Two methods were employed: (1) “multi-model term scoring”, which used stepwise logistic regression to develop 21 separate models by varying three frequency-weight and seven term-weight options, and (2) “iterative term refinement”, which used a standard stop list followed by clinical review to eliminate non-clinical terms and terms not related to PTSD. The combined results of the two methods were reviewed by two clinicians, yielding 226 unique PTSD-related terms. The STM results were then compared with ongoing efforts to identify terms through literature review, focus groups with clinicians treating PTSD, and review of an existing vocabulary; the comparison lends support to the contributions of the STM analyses.
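The count of 21 models in the multi-model term scoring step follows from pairing each of the three frequency-weight options with each of the seven term-weight options (3 × 7 = 21). The abstract does not name the options, so the minimal Python sketch below uses assumed frequency weights (raw, binary, log) and placeholder labels for the term weights; all names and helper functions are hypothetical and are meant only to illustrate how such a grid of model configurations might be enumerated, not to reproduce the authors' implementation.

```python
import math
from itertools import product

# Assumed frequency-weight options; the abstract gives only the count (three).
FREQUENCY_WEIGHTS = {
    "raw": lambda count: float(count),
    "binary": lambda count: 1.0 if count > 0 else 0.0,
    "log": lambda count: math.log2(count + 1),
}

# Placeholder labels for the seven term-weight options, which the abstract
# does not name either.
TERM_WEIGHTS = [f"term_weight_{i}" for i in range(1, 8)]


def model_grid():
    """Return every (frequency weight, term weight) pairing, one per model."""
    return list(product(FREQUENCY_WEIGHTS, TERM_WEIGHTS))


def apply_frequency_weight(counts, name):
    """Apply a frequency weight to a documents-by-terms matrix of raw counts."""
    weight = FREQUENCY_WEIGHTS[name]
    return [[weight(c) for c in row] for row in counts]


grid = model_grid()
print(len(grid))  # 21 configurations, one per model
```

In a full pipeline, each configuration would produce its own weighted term-by-document matrix, and a separate stepwise logistic regression would be fitted to each one.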
Related Topics
Physical Sciences and Engineering > Computer Science > Computer Science Applications