Natural language of uncertainty: numeric hedge words

Article ID	Journal	Published Year	Pages	File Type
396968	International Journal of Approximate Reasoning	2015	21 Pages	PDF

Abstract

• One system of expressing uncertainty in natural languages involves approximators.
• Approximators are linguistic hedges such as “about” or “more than”, with a number.
• We used Amazon Mechanical Turk to decode quantitative meanings of approximators.
• Human interpretations vary widely, but there may be as few as three kinds of hedges.
• Hedge word choice interacts with the magnitude and roundness of the nominal quantity.

An important part of processing elicited numerical inputs is the ability to quantitatively decode natural-language words that are commonly used to express or modify numerical values. These include ‘about’, ‘around’, ‘almost’, ‘exactly’, ‘nearly’, ‘below’, ‘at least’, ‘order of’, etc., which are collectively known as approximators or numerical hedges. Determining what these expressions imply quantitatively about the uncertainty of a numerical quantity is needed to understand, for example, what a patient is actually reporting when saying that a headache has lasted for “about 7 days”, and how that report should be translated into uncertainty about the duration. We used Amazon Mechanical Turk to empirically identify the implications of various approximators common in English. To evaluate the numerical range implied by each approximator, we analyzed paired statements that differed only in the approximator used in the numerical expression. Although interpretations were often highly diverse, there were several statistically significant findings; however, the approximators implied far less quantitative variation than might have been expected. The numerical implication of an approximator interacts with the magnitude and roundness of the nominal quantity. This investigation strategy generalizes easily to languages other than English.
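
To make the idea of decoding an approximator concrete, the following is a minimal sketch of how a hedged expression such as “about 7 days” might be turned into an interval whose width depends on the roundness of the nominal quantity. It is illustrative only: the grouping of hedges by direction, the `spread` parameter, and the rule that the interval width equals the roundness of the number are all assumptions made for this example, not values or classifications reported by the study.

```python
def roundness(n: int) -> int:
    """Largest power of ten that divides n (7 -> 1, 50 -> 10, 200 -> 100)."""
    if n == 0:
        return 1
    step = 1
    while n % (step * 10) == 0:
        step *= 10
    return step


# Hypothetical grouping of hedges by direction.
# This is NOT the classification found in the study.
DIRECTION = {
    "exactly": "point",
    "about": "both", "around": "both",
    "almost": "below", "nearly": "below", "below": "below",
    "at least": "above", "more than": "above",
}


def decode(hedge: str, value: int, spread: float = 1.0) -> tuple[float, float]:
    """Return an illustrative (low, high) interval for a hedged number.

    The interval width is spread * roundness(value), a placeholder rule
    chosen for this sketch rather than an empirically estimated one.
    """
    unit = spread * roundness(value)
    kind = DIRECTION[hedge]
    if kind == "point":
        return (value, value)
    if kind == "both":
        return (value - unit, value + unit)
    if kind == "below":
        return (value - unit, value)
    return (value, value + unit)


print(decode("about", 7))     # (6.0, 8.0)
print(decode("about", 200))   # (100.0, 300.0)
print(decode("almost", 50))   # (40.0, 50)
```

In this sketch a rounder nominal quantity such as 200 yields a wider interval than 7 for the same hedge, which is one simple way to represent an interaction between hedge, magnitude, and roundness; in practice the widths would need to be estimated from elicited responses.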

Keywords

Uncertainty communication; Amazon Mechanical Turk; Elicitation; Hedge