Article code: 504922
Journal code: 864452
Publication year: 2015
English article: 7 pages, PDF
Full-text version: free download
English title of the ISI article
Maximizing clinical cohort size using free text queries
Persian translation of the title
حداکثرسازی اندازه کوهورت بالینی با استفاده از نمایش متن آزاد
Keywords
Gingko; Warfarin; Overweight; Diabetes; Text query; Structured data; Cohort identification; Unstructured data; Clinical notes
Related topics
Engineering and Basic Sciences; Computer Engineering; Computer Science Applications
English abstract


• We demonstrate the value of free text queries in cohort identification.
• Incremental value is added compared to structured data queries alone.
• We determine the value of free text using 3 disparate use cases.
• Use case specific values and limitations are identified in large data sets.
• We show the exploratory value of a direct search tool in contrast with heavier NLP systems.

Background
Cohort identification is important in both population health management and research. In this project we sought to assess the use of text queries for cohort identification. Specifically we sought to determine the incremental value of unstructured data queries when added to structured queries for the purpose of patient cohort identification.

Methods
Three cohort identification tasks were evaluated: identification of individuals taking gingko biloba and warfarin simultaneously (Gingko/Warfarin), individuals who were overweight, and individuals with uncontrolled diabetes (UCD). We assessed the increase in cohort size when unstructured data queries were added to structured data queries. The positive predictive value of unstructured data queries was assessed by manual chart review of a random sample of 500 patients.

Results
For Gingko/Warfarin, text query increased the cohort size from 9 to 28,924 over the cohort identified by query of pharmacy data only. For the weight-related tasks, text search increased the cohort by 5–29% compared to the cohort identified by query of the vitals table. For the UCD task, text query increased the cohort size by 2–43% compared to the cohort identified by query of laboratory results or ICD codes. The positive predictive values for text searches were 52% for Gingko/Warfarin, 19–94% for the weight cohort and 44% for UCD.

Discussion
This project demonstrates the value and limitation of free text queries in patient cohort identification from large data sets. The clinical domain and prevalence of the inclusion and exclusion criteria in the patient population influence the utility and yield of this approach.
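
The two measures named in the abstract, the incremental cohort size gained from a text query and the positive predictive value (PPV) from chart review, can be illustrated with a minimal Python sketch. This is not the authors' search tool: the patient IDs, note texts, chart-review labels, and the helper names `text_query`, `incremental_gain`, and `positive_predictive_value` are all hypothetical, and the keyword match is a plain case-insensitive regular expression standing in for whatever query engine was actually used.

```python
import re

# Hypothetical toy data: patient IDs returned by a structured query
# (for example, a pharmacy table lookup).
structured_ids = {101, 102, 103}

# Hypothetical clinical notes keyed by patient ID.
notes = {
    101: "Patient on warfarin; also reports taking gingko biloba.",
    104: "Takes Gingko supplement daily. Warfarin 5 mg continued.",
    105: "Advised to avoid gingko while on warfarin.",
    106: "No anticoagulants. Uses gingko for memory.",
}


def text_query(notes, terms):
    """Return IDs whose note mentions every term (case-insensitive, whole word)."""
    patterns = [re.compile(r"\b" + re.escape(t) + r"\b", re.IGNORECASE) for t in terms]
    return {pid for pid, text in notes.items() if all(p.search(text) for p in patterns)}


def incremental_gain(structured, text):
    """Return patients found only by the text query and the percent increase."""
    added = text - structured
    pct = 100.0 * len(added) / len(structured) if structured else float("inf")
    return added, pct


def positive_predictive_value(reviewed):
    """PPV = confirmed positives / all reviewed text-query hits."""
    return sum(reviewed.values()) / len(reviewed)


text_ids = text_query(notes, ["gingko", "warfarin"])
added, pct = incremental_gain(structured_ids, text_ids)

# Hypothetical chart-review outcome for the text-query hits:
# patient 105 mentions both terms but was only advised to avoid the combination.
chart_review = {101: True, 104: True, 105: False}
ppv = positive_predictive_value(chart_review)

print(f"text-only additions: {sorted(added)}, increase: {pct:.1f}%, PPV: {ppv:.2f}")
```

In this toy data, patient 105 matches both keywords without actually taking the combination, which is the kind of false positive that lowers the PPV of a free text query and motivates the manual chart review described in the Methods.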

Publisher
Database: Elsevier - ScienceDirect
Journal: Computers in Biology and Medicine - Volume 60, 1 May 2015, Pages 1–7