Article code: 378911
Journal code: 659234
Publication year: 2012
English article: 23 pages, PDF
English title of the ISI article
Narrative-based taxonomy distillation for effective indexing of text collections
Related topics
Engineering and Basic Sciences > Computer Engineering > Artificial Intelligence
Abstract

Taxonomies embody formalized knowledge and define aggregations between concepts/categories in a given domain, facilitating the organization of data and making content easily accessible to users. Because taxonomies play significant roles in data annotation, search, and navigation, they are often carefully engineered. However, especially in domains such as news, where content evolves dynamically, they do not necessarily reflect the knowledge contained in the content. Thus, in this paper, we ask, and answer in the affirmative, the following question: “is it possible to efficiently and effectively adapt a given taxonomy to a usage context defined by a corpus of documents?” In particular, we recognize that the primary role of a taxonomy is to describe or narrate the natural relationships between concepts in a given document corpus. Therefore, a corpus-aware adaptation of a taxonomy should essentially distill the structure of the existing taxonomy by appropriately segmenting and, if needed, summarizing this narrative relative to the content of the corpus. Based on this key observation, we propose A Narrative Interpretation of Taxonomies for their Adaptation (ANITA) for restructuring existing taxonomies to suit varying application contexts, and we evaluate the proposed scheme using different text collections. Finally, we provide user studies showing that the proposed algorithm is able to adapt the taxonomy into a new, compact, and understandable structure.

Publisher
Database: Elsevier - ScienceDirect
Journal: Data & Knowledge Engineering - Volume 72, February 2012, Pages 103–125