Article ID: 554733
Journal: Decision Support Systems
Published Year: 2014
Pages: 16
File Type: PDF
Abstract

• We present a semantic approach for learning domain taxonomies from text.
• Word sense disambiguation is applied on text and on existing taxonomies.
• We refine the subsumption method for term relations to include concept semantics.
• We define new semantic measures for evaluating the built taxonomies.
• Our method performs well for capturing the broader–narrower inter-concept relation.

In this paper we present a framework for automatically building a domain taxonomy from text corpora, called Automatic Taxonomy Construction from Text (ATCT). The framework comprises four steps. First, terms are extracted from a corpus of documents. Second, the extracted terms that are most relevant to the domain are selected using a filtering approach. Third, the selected terms are disambiguated by means of a word sense disambiguation technique, and concepts are generated. Fourth, the broader–narrower relations between concepts are determined using a subsumption technique that makes use of concept co-occurrences in text. We assess the performance of the ATCT framework using semantic precision, semantic recall, and the taxonomic F-measure, all of which take concept semantics into account. The framework is evaluated in two domains: economics and management, and medicine.
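
The abstract does not spell out the subsumption rule used in the final step. As an illustration only, the sketch below assumes the classic document-level co-occurrence formulation: concept x is treated as broader than concept y when P(x | y) meets a threshold and P(y | x) is smaller than P(x | y). The function name, the input format (one set of concept labels per document), and the threshold of 0.8 are assumptions made for this example, not details taken from the paper.

```python
from collections import defaultdict
from itertools import permutations


def subsumption_pairs(docs, threshold=0.8):
    """Return sorted (broader, narrower) concept pairs.

    docs: iterable of collections of concept labels, one per document.
    threshold: illustrative cut-off for P(broader | narrower); an assumption.
    """
    doc_freq = defaultdict(int)  # number of documents containing each concept
    co_freq = defaultdict(int)   # number of documents containing both x and y

    for concepts in docs:
        concepts = set(concepts)
        for c in concepts:
            doc_freq[c] += 1
        for x, y in permutations(concepts, 2):
            co_freq[(x, y)] += 1

    pairs = []
    for (x, y), n_xy in co_freq.items():
        p_x_given_y = n_xy / doc_freq[y]  # how often x appears where y appears
        p_y_given_x = n_xy / doc_freq[x]  # how often y appears where x appears
        if p_x_given_y >= threshold and p_y_given_x < p_x_given_y:
            pairs.append((x, y))
    return sorted(pairs)


# Toy corpus: each set holds the (already disambiguated) concepts of one document.
docs = [
    {"finance", "bank"},
    {"finance", "bank", "loan"},
    {"finance", "stock"},
    {"finance", "loan", "bank"},
    {"finance"},
]
print(subsumption_pairs(docs))
```

On this toy corpus the sketch reports "finance" as broader than "bank", "loan", and "stock", and "bank" as broader than "loan". The pair ("finance", "loan") is implied transitively by the other two; removing such redundant edges would be a separate pruning step that this sketch leaves out.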
