On the relationship between training sample size and data dimensionality: Monte Carlo analysis of broadband multi-temporal classification

Article code	Journal code	Publication year	English article	Full-text version
10114069	1621381	2005	13-page PDF	Free download


Keywords

Dimensionality; Maximum likelihood; Time-series; Crop classification; Training sample; Multi-temporal

Related topics

Engineering and Basic Sciences, Earth and Planetary Sciences, Computers in Earth Sciences

Abstract

The number of training samples per class (n) required for accurate Maximum Likelihood (ML) classification is known to be affected by the number of bands (p) in the input image. However, the general rule which defines that n should be 10p to 30p is often enforced universally in remote sensing without questioning its relevance to the complexity of the specific discrimination problem. Furthermore, identifying this many training samples is often problematic when many classes and/or many bands are used. It is important, then, to test how this generally accepted rule matches common remote sensing discrimination problems because it could be unnecessarily restrictive for many applications. This study was primarily conducted in order to test whether the general rule defining the relationship between n and p was well-suited for ML classification of a relatively simple remote sensing-based discrimination problem. To summarise the mean response of n-to-p for our study site, a Monte Carlo procedure was used to randomly stack various numbers of bands into thousands of separate image combinations that were then classified using an ML algorithm. The bands were randomly selected from a 119-band Enhanced Thematic Mapper-plus (ETM+) dataset comprised of 17 images acquired during the 2001-2002 southern hemisphere summer agricultural growing season over an irrigation area in south-eastern Australia. Results showed that the number of training samples needed for accurate ML classification was much lower than the current widely accepted rule. Due to the asymptotic nature of the relationship, we found that 95% of the accuracy attained using n = 30p samples could be achieved by using approximately 2p to 4p samples, or ≤ 1/7th the currently recommended value of n.
Our findings show that the number of training samples needed for a simple discrimination problem is much less than that defined by the general rule and therefore the rule should not be universally enforced; the number of training samples needed should also be determined by considering the complexity of the discrimination problem.
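
As a rough illustration of the arithmetic in the abstract, the sketch below compares the general rule (n = 10p to 30p) with the reported range (approximately 2p to 4p) for a few band counts, and draws one random band stack the way a Monte Carlo iteration might. Only the multipliers and the 119-band total come from the abstract. The function names, the example values of p and the random seed are assumptions made for illustration, and the classification step itself is not shown.

```python
import random

# Multipliers taken from the abstract; everything else here is illustrative.
GENERAL_RULE = (10, 30)  # widely used rule: n = 10p to 30p
REPORTED = (2, 4)        # approx. 2p to 4p reached 95% of the accuracy at n = 30p
TOTAL_BANDS = 119        # size of the multi-temporal ETM+ dataset


def samples_per_class(p, multipliers):
    """Return the (low, high) training samples per class for p bands."""
    low, high = multipliers
    return low * p, high * p


def random_band_stack(p, rng, total_bands=TOTAL_BANDS):
    """Pick p distinct band indices at random (one hypothetical Monte Carlo draw)."""
    return sorted(rng.sample(range(total_bands), p))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, chosen arbitrarily for repeatability
    for p in (6, 30, 119):  # hypothetical stack sizes
        rule = samples_per_class(p, GENERAL_RULE)
        found = samples_per_class(p, REPORTED)
        print(f"p={p}: general rule {rule[0]}-{rule[1]}, reported {found[0]}-{found[1]}")
    print("example stack of 6 bands:", random_band_stack(6, rng))
```

For the full 119-band dataset the general rule implies 1190 to 3570 samples per class, whereas the reported range corresponds to 238 to 476. The upper ends of the two ranges are in the ratio 4/30, which is below the 1/7 quoted in the abstract.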

Publisher

Database: Elsevier - ScienceDirect
Journal: Remote Sensing of Environment - Volume 98, Issue 4, 30 October 2005, Pages 468-480

Authors

Thomas G. Van Niel, Tim R. McVicar, Bisun Datt
