Article code: 4953339
Journal code: 1443005
Publication year: 2017
English article: 16-page PDF
Full-text version: Free download
English title of the ISI article
Designing image segmentation studies: Statistical power, sample size and reference standard quality
Related topics
Engineering and Basic Sciences > Computer Engineering > Computer Graphics and Computer-Aided Design
English abstract


- A sample size calculation for segmentation accuracy studies is derived.
- Parameters include accuracy difference, algorithm disagreement and a design factor.
- A formula is derived to account for errors in the study reference standard.
- A case study illustrates the application of the theory to a segmentation study design.

Segmentation algorithms are typically evaluated by comparison to an accepted reference standard. The cost of generating accurate reference standards for medical image segmentation can be substantial. Since the study cost and the likelihood of detecting a clinically meaningful difference in accuracy both depend on the size and on the quality of the study reference standard, balancing these trade-offs supports the efficient use of research resources.

In this work, we derive a statistical power calculation that enables researchers to estimate the appropriate sample size to detect clinically meaningful differences in segmentation accuracy (i.e. the proportion of voxels matching the reference standard) between two algorithms. Furthermore, we derive a formula to relate reference standard errors to their effect on the sample sizes of studies using lower-quality (but potentially more affordable and practically available) reference standards.

The accuracy of the derived sample size formula was estimated through Monte Carlo simulation, demonstrating, with 95% confidence, a predicted statistical power within 4% of simulated values across a range of model parameters. This corresponds to sample size errors of less than 4 subjects and errors in the detectable accuracy difference of less than 0.6%. The applicability of the formula to real-world data was assessed using bootstrap resampling simulations for pairs of algorithms from the PROMISE12 prostate MR segmentation challenge data set. The model predicted the simulated power for the majority of algorithm pairs within 4% for simulated experiments using a high-quality reference standard and within 6% for simulated experiments using a low-quality reference standard. A case study, also based on the PROMISE12 data, illustrates using the formulae to evaluate whether to use a lower-quality reference standard in a prostate segmentation study.
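
The abstract does not reproduce the derived formula, so the following is only a rough sketch of the general idea: a sample size is computed for a target power, then checked against Monte Carlo simulation. It uses the textbook normal-approximation formula for a paired comparison, not the paper's formula, which additionally involves algorithm disagreement, a design factor and reference standard errors. The function names, the per-subject accuracy difference `delta` and its standard deviation `sd_diff` are illustrative assumptions.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def paired_sample_size(delta, sd_diff, alpha=0.05, power=0.8):
    """Subjects needed to detect a mean paired difference `delta`.

    Textbook normal approximation (an assumption, not the paper's formula):
    n = ((z_{1-alpha/2} + z_{power}) * sd_diff / delta) ** 2
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = z.inv_cdf(power)           # quantile for the target power
    return math.ceil(((z_alpha + z_beta) * sd_diff / delta) ** 2)


def simulated_power(n, delta, sd_diff, alpha=0.05, trials=2000, seed=0):
    """Monte Carlo estimate of power for a two-sided paired z-test."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # Hypothetical per-subject accuracy differences between two algorithms.
        diffs = [rng.gauss(delta, sd_diff) for _ in range(n)]
        z_stat = mean(diffs) / (stdev(diffs) / math.sqrt(n))
        if abs(z_stat) > z_crit:
            rejections += 1
    return rejections / trials


if __name__ == "__main__":
    # Assumed values: 2% accuracy difference, 5% SD of paired differences.
    n = paired_sample_size(delta=0.02, sd_diff=0.05)
    print("subjects needed:", n)
    print("simulated power:", simulated_power(n, delta=0.02, sd_diff=0.05))
```

Comparing the simulated power with the 0.8 target mirrors, in a much simpler setting, the kind of validation the abstract describes.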


Publisher
Database: Elsevier - ScienceDirect
Journal: Medical Image Analysis - Volume 42, December 2017, Pages 44-59