Article ID: 6937362
Journal: Computer Vision and Image Understanding
Published Year: 2018
Pages: 30
File Type: PDF
Abstract
Generating ground truth to design object detectors for large geographic areas covered by hundreds of satellite images poses two major challenges: one algorithmic and the other rooted in human-computer interaction considerations. The algorithmic challenge is to minimize the human annotation burden by collecting only those ground truth samples that are likely to improve the classifier. The human-computer interaction challenge concerns the latencies involved in scanning all the images to find those samples and in eliciting annotations from a user. We address the algorithmic challenge using well-known concepts from Active Learning, albeit with a significant departure from how Active Learning has traditionally been presented in the literature: we present a human-operated active learning framework, rather than relying on previously collected fully labeled datasets for simulated experiments. We address the human-computer interaction challenge with a distributed approach in which multiple virtual machines work in parallel, carrying out randomized scans in different portions of the geographic area to generate the active-learning-based samples for human annotation. We demonstrate our wide-area framework on two infrequently occurring objects over large geographic areas in Australia: pedestrian crosswalks in a region that spans 180,000 sq. km, and power transmission-line towers in a region that spans 150,000 sq. km. Measured on randomly selected unseen regions, the crosswalk detector achieves 92% precision and 72% recall, and the transmission-line tower detector achieves 80% precision and 50% recall.
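The sample-selection idea in the abstract can be illustrated with a minimal sketch. The abstract does not say which active-learning query strategy the authors use, so the sketch assumes plain uncertainty sampling over detector scores in [0, 1]. The function names (`uncertainty`, `select_for_annotation`, `randomized_scan`) and the window/score representation are hypothetical and are not taken from the paper.

```python
import random


def uncertainty(score):
    """Assumed uncertainty measure: 1.0 at score 0.5, 0.0 at score 0 or 1."""
    return 1.0 - 2.0 * abs(score - 0.5)


def select_for_annotation(candidates, budget):
    """Pick the `budget` candidates the detector is least sure about.

    candidates: list of (window_id, score) pairs, score in [0, 1].
    Returns the selected pairs, most uncertain first.
    """
    ranked = sorted(candidates, key=lambda c: uncertainty(c[1]), reverse=True)
    return ranked[:budget]


def randomized_scan(window_ids, sample_size, seed):
    """Draw a random subset of windows for one worker to score.

    Each worker (a virtual machine, in the abstract's terms) would call this
    on its own portion of the area; here it is just a seeded random sample
    without replacement.
    """
    rng = random.Random(seed)
    return rng.sample(window_ids, min(sample_size, len(window_ids)))


if __name__ == "__main__":
    # Hypothetical detector scores for four image windows.
    scored = [("a", 0.05), ("b", 0.48), ("c", 0.95), ("d", 0.60)]
    for window_id, score in select_for_annotation(scored, budget=2):
        print(window_id, score)
```

With these hypothetical scores, windows `b` and `d` are sent to the annotator because their scores lie closest to 0.5, while the confidently scored windows `a` and `c` are skipped.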
Related Topics
Physical Sciences and Engineering > Computer Science > Computer Vision and Pattern Recognition