Article ID: 423778
Journal: Electronic Notes in Theoretical Computer Science
Published Year: 2006
Pages: 20
File Type: PDF
Abstract

One of the most important aspects of a Web document is its up-to-dateness, or recency. Up-to-dateness is particularly relevant to Web documents because they usually contain content that originates from different sources and is refreshed at different times. Whether a Web document is relevant to a reader depends on the history of its contents and on so-called external factors, i.e., the up-to-dateness of semantically related documents. In this paper, we approach the automatic management of up-to-dateness for Web documents that are managed by an XML-centric Web content management system. First, the freshness of a single document is computed, taking into account its change history; a document metric estimates the distance between different versions of the document. Second, the up-to-dateness of a document is determined from its own history and the historical evolution of semantically related documents.
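
The two steps described in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so everything below is an assumption made for illustration and not the paper's method: the document metric is approximated with a generic sequence-similarity ratio from Python's standard library, each change is discounted by an exponential decay with a hypothetical half-life, and related documents are combined by a simple weighted mean. The names `Version`, `distance`, `freshness`, `up_to_dateness`, `half_life`, and `weight` are all hypothetical.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import List


@dataclass
class Version:
    day: float  # hypothetical timestamp, in days since the first version
    text: str   # document content at that time


def distance(old: str, new: str) -> float:
    """Assumed document metric: 0.0 for identical texts, larger for bigger changes."""
    return 1.0 - SequenceMatcher(None, old, new).ratio()


def freshness(history: List[Version], now: float, half_life: float = 30.0) -> float:
    """Step 1 (assumed form): sum the size of each change, discounted by its age."""
    score = 0.0
    for prev, curr in zip(history, history[1:]):
        age = now - curr.day
        score += distance(prev.text, curr.text) * 0.5 ** (age / half_life)
    return score


def up_to_dateness(own: float, related: List[float], weight: float = 0.5) -> float:
    """Step 2 (assumed form): blend own freshness with the mean of related documents."""
    if not related:
        return own
    return (1.0 - weight) * own + weight * sum(related) / len(related)
```

Under these assumptions, a document with a single version has freshness 0.0, and the same edit contributes less the longer ago it happened. An XML-aware tree distance would probably fit an XML-centric content management system better than the character-level comparison used here.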

Related Topics
Physical Sciences and Engineering > Computer Science > Computational Theory and Mathematics