Article ID Journal Published Year Pages File Type
5627440 Clinical Neurophysiology 2017 8 Pages PDF
Abstract

• Inter-reader and algorithm versus human agreement for spike detection using pairwise comparisons.
• A statistical Turing test evaluates for algorithm noninferiority versus skilled human performance.
• The Persyst 13 spike detection algorithm proved noninferior to a set of three skilled human readers.

Objective
Compare the spike detection performance of three skilled humans and three computer algorithms.

Methods
Forty prolonged EEGs, 35 containing reported spikes, were evaluated. Spikes and sharp waves were marked by the humans and algorithms. Pairwise sensitivity and false positive rates were calculated for each human-human and algorithm-human pair. Differences in human pairwise performance were calculated and compared to the range of algorithm versus human performance differences as a type of statistical Turing test.

Results
The humans marked 5474 individual spike events. Mean pairwise human sensitivities were 40.0%, 42.1%, and 51.5%, with false positive rates of 0.80, 0.97, and 1.99/min. Only the Persyst 13 (P13) algorithm was comparable to the humans, with a sensitivity of 43.9% and a false positive rate of 1.65/min. Evaluation of pairwise differences in sensitivity and false positive rate demonstrated that P13 met statistical noninferiority criteria compared to the humans.

Conclusion
Humans had only a fair level of agreement in spike marking. The P13 algorithm was statistically noninferior to the humans.

Significance
This was the first time that a spike detection algorithm and humans performed similarly. The performance comparison methodology utilized here is generally applicable to problems in which skilled human performance is the desired standard and no external gold standard exists.
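
The pairwise sensitivity and false positive rate described in the Methods can be illustrated with a small sketch. This is not the study's procedure: the abstract does not say how marks from two readers were matched, so the 0.2 s tolerance window, the function names, and the toy mark times below are all assumptions chosen for illustration. One reader is treated as the reference and the other as the test; sensitivity is the fraction of reference marks that the test reader also marked, and the false positive rate is the number of unmatched test marks per minute of recording.

```python
from bisect import bisect_left


def has_match(t, sorted_times, tol):
    """Return True if any value in sorted_times lies within tol seconds of t."""
    i = bisect_left(sorted_times, t - tol)
    return i < len(sorted_times) and sorted_times[i] <= t + tol


def pairwise_performance(test_marks, ref_marks, duration_min, tol=0.2):
    """Score one reader (test) against another (reference).

    tol is a hypothetical matching window in seconds; the abstract does not
    state the window the authors used.
    """
    test = sorted(test_marks)
    ref = sorted(ref_marks)
    # Reference marks that the test reader also marked.
    detected = sum(has_match(t, test, tol) for t in ref)
    # Test marks with no nearby reference mark.
    false_pos = sum(not has_match(t, ref, tol) for t in test)
    sensitivity = detected / len(ref) if ref else float("nan")
    return sensitivity, false_pos / duration_min


# Toy mark times in seconds (invented for illustration, not study data).
reader_a = [1.0, 5.0, 9.0, 20.0]
reader_b = [1.1, 9.05, 30.0, 40.0]

sens, fp_per_min = pairwise_performance(reader_a, reader_b, duration_min=2.0)
print(f"sensitivity={sens:.2f}, false positives/min={fp_per_min:.2f}")
```

Because each reader serves as the reference in turn, every pair yields two sensitivity and false positive values. Scoring human-human and algorithm-human pairs in this same way is what allows the differences between them to be compared when no external gold standard exists.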

Related Topics
Life Sciences Neuroscience Neurology