Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
4946149 | Knowledge-Based Systems | 2017 | 34 Pages |
Abstract
This work introduces a novel technique for extracting the main concepts from a text. Concepts are described by word-based connections arranged in a semantic topological space built on a formal model, the simplicial complex. The complex links points, i.e., the words appearing in the text, and incrementally creates a geometrical structure describing concepts that are more or less specialized, depending on the aggregation distance of the words. The conceptual network is context-aware, since it reveals unambiguous concepts, specialized through analysis of the surrounding text. The framework that implements the approach discovers basic concepts, composed of the minimal number of words needed to describe a finite-sense concept, and richer extended concepts built by adding further relations among terms. The final topological space provides a multi-granule concept representation, from a local, word-closeness view to a highly refined description. Experiments and comparative analysis validate the effectiveness of the approach, showing satisfactory performance in concept identification: precision exceeds 80% in most of the experiments, and recall is on average around 60-70%, with peaks of 90% for some specific concept categories.
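To make the idea of a distance-dependent simplicial complex over words more concrete, the following is a minimal sketch of a Vietoris-Rips-style construction. It is an illustration under stated assumptions, not the authors' framework: the abstract does not say how words are embedded or how simplices are formed, so the 2-D coordinates, the Euclidean metric, the `rips_complex` function, and the `radius` parameter are all hypothetical.

```python
from itertools import combinations
from math import dist


def rips_complex(points, radius, max_dim=2):
    """Build a Vietoris-Rips-style complex (an assumed construction).

    points  : dict mapping each word to its coordinates (hypothetical embedding)
    radius  : aggregation distance; words within it are considered connected
    max_dim : highest simplex dimension to build (2 = triangles)
    """
    words = sorted(points)
    # 0-simplices: every word is a vertex.
    simplices = [(w,) for w in words]
    # k-vertex simplices: keep a group only if all its pairs are close enough.
    for k in range(2, max_dim + 2):
        for group in combinations(words, k):
            if all(dist(points[a], points[b]) <= radius
                   for a, b in combinations(group, 2)):
                simplices.append(group)
    return simplices


# Hypothetical coordinates, chosen only for illustration.
toy_points = {
    "bank": (0.0, 0.0),
    "loan": (1.0, 0.0),
    "money": (0.0, 1.0),
    "river": (5.0, 5.0),
}

tight = rips_complex(toy_points, radius=1.0)  # small aggregation distance
loose = rips_complex(toy_points, radius=1.5)  # larger aggregation distance
```

With the smaller radius, only the pairs (`bank`, `loan`) and (`bank`, `money`) are linked. With the larger one, `loan` and `money` also connect, so the three words close into a triangle, a richer grouping in the spirit of the extended concepts the abstract mentions, while `river` stays isolated in both cases.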
Keywords
Related Topics
Physical Sciences and Engineering
Computer Science
Artificial Intelligence
Authors
Danilo Cavaliere, Sabrina Senatore, Vincenzo Loia