Article ID: 455067
Journal: Computers & Electrical Engineering
Published Year: 2013
Pages: 16 Pages
File Type: PDF
Abstract

Scrutinizing web resources of interest from a large number of search results is a tedious task for any web user. Fortunately, social sites such as Social Bookmarking Sites (SBS) allow web users to store their preferences and the search results that interest them in the form of bookmarks. Such sites, however, contain a lot of irrelevant data as noise, and predicting relevant URLs amid that noise is a real challenge. To overcome this challenge, this paper proposes a focused crawler, FCHC, that mimics a human cognitive search pattern to find potentially relevant web resources from an SBS. The focused crawler uses a domain-specific Concept Ontology to semantically expand a search topic and to determine the Semantic Relevance of tags. The crawler is tested with different search patterns on the ‘database’ domain and evaluated using a well-established metric, harvest ratio. The performance of FCHC is analyzed and compared with that of focused crawlers that crawl the WWW with and without an ontology.
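
The harvest ratio named in the abstract is commonly defined as the fraction of crawled pages that are judged relevant to the topic. The following is a minimal Python sketch under that common definition; the paper's exact formulation and its relevance judgment are not given here, and `harvest_ratio` and `crawl_log` are illustrative names, not part of FCHC.

```python
from typing import Iterable


def harvest_ratio(relevance_flags: Iterable[bool]) -> float:
    """Return the fraction of crawled pages judged relevant.

    Assumption: each flag records whether one fetched page was judged
    relevant; how that judgment is made is outside this sketch.
    """
    flags = list(relevance_flags)
    if not flags:
        # No pages crawled yet: report 0.0 rather than dividing by zero.
        return 0.0
    return sum(flags) / len(flags)


# Hypothetical crawl log: True means the fetched page was judged relevant.
crawl_log = [True, False, True, True, False]
ratio = harvest_ratio(crawl_log)
print(f"harvest ratio: {ratio:.2f}")
```

A higher ratio means the crawler spent less of its fetch budget on irrelevant pages, which is why the metric suits comparisons between focused crawlers.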

Highlights

► The paper proposes a Social Semantic Focused Crawler called FCHC.
► The paper also proposes a design of the Concept Ontology (CO).
► It computes Social Semantic Relevance of tagged documents using the CO.
► FCHC is compared with a Semantic Focused Crawler and a Classic Focused Crawler.
► FCHC implemented with three search patterns showed better performance.

Related Topics
Physical Sciences and Engineering > Computer Science > Computer Networks and Communications