Article ID: 242420
Journal: Advanced Engineering Informatics
Published Year: 2006
Pages: 13
File Type: PDF
Abstract

Retrieval of document fragments has great potential for application in engineering information management. Engineers frequently have neither the time nor the inclination to sift through long documents for small pieces of useful information, yet the information they seek is often presented in the form of one or more long documents. Supporting the delivery of the right information, in the right format and in the right quantity, motivates the search for better ways of handling document sub-components, or fragments. Document fragment retrieval can be facilitated using modern computational technologies. This paper proposes a novel framework for information access that utilises state-of-the-art computational technologies and introduces the use of multiple document structure views through decomposition schemes. The framework integrates document structure study, mark-up technologies, automated fragment extraction, faceted classification and a document navigation mechanism to achieve retrieval of specific document fragments using precise, complex queries. These disparate elements have been brought together in an exploratory Engineering Document Content Management System (EDCMS). Investigations with this system on representative engineering documents have shown that information users can access and retrieve document content at fragment level rather than at document level: through both the data in a document and the document metadata, through different perspectives and at different granularities, and simultaneously across multiple documents as well as within a single document.
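
The idea of retrieving fragments through several decomposition schemes and facets can be illustrated with a minimal sketch. It is not the EDCMS implementation, which the abstract does not describe in detail. The XML tags, attribute names, the `Fragment` class and the `extract_fragments` and `query` functions are all assumptions made for illustration; a "decomposition scheme" is simplified here to the choice of element tag at which a document is split.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Hypothetical marked-up engineering documents. The tag and attribute names
# are illustrative assumptions, not the mark-up used in the paper.
DOCS = [
    """<report id="R1" discipline="structures">
  <section title="Materials" topic="material">
    <para topic="material">Steel grade S355 is specified for the frame.</para>
    <para topic="cost">Material cost is estimated from supplier quotes.</para>
  </section>
  <section title="Testing" topic="test">
    <para topic="test">Fatigue tests ran for two million cycles.</para>
  </section>
</report>""",
    """<report id="R2" discipline="structures">
  <section title="Materials" topic="material">
    <para topic="material">Aluminium 6061 is used for the brackets.</para>
  </section>
</report>""",
]


@dataclass
class Fragment:
    doc_id: str   # document the fragment came from
    scheme: str   # decomposition scheme (here: the element tag)
    text: str     # whitespace-normalised fragment content
    facets: dict  # facet name -> value, from document and element attributes


def extract_fragments(xml_text, scheme):
    """Split one document into fragments at every element named `scheme`."""
    root = ET.fromstring(xml_text)
    fragments = []
    for elem in root.iter(scheme):
        text = " ".join("".join(elem.itertext()).split())
        # Document-level metadata and element-level attributes both act as facets.
        facets = {"discipline": root.get("discipline"), **elem.attrib}
        fragments.append(Fragment(root.get("id"), scheme, text, facets))
    return fragments


def query(fragments, scheme=None, **facets):
    """Return fragments matching every requested facet, optionally at one granularity."""
    return [
        f for f in fragments
        if (scheme is None or f.scheme == scheme)
        and all(f.facets.get(name) == value for name, value in facets.items())
    ]


# Index every document under two decomposition schemes (two granularities).
index = [
    fragment
    for doc in DOCS
    for scheme in ("section", "para")
    for fragment in extract_fragments(doc, scheme)
]

# The same facet query, answered at paragraph and at section granularity,
# across both documents at once.
para_hits = query(index, scheme="para", topic="material")
section_hits = query(index, scheme="section", topic="material", discipline="structures")
print([f.doc_id for f in para_hits])  # ['R1', 'R2']
```

Keeping the scheme as an explicit field on each fragment is one simple way to let the same query be posed at different granularities; a real system would likely need richer structure views than a single tag name.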
