Parallel mining of OWL 2 EL ontology from large linked datasets

Article ID	Journal	Published Year	Pages	File Type
402276	Knowledge-Based Systems	2015	8 Pages	PDF

Abstract

Linked Data has become a vast repository, with billions of triples available in thousands of datasets. One of the challenges in integrating, querying, and reusing Linked Data is obtaining the ontology to which the datasets conform. Although many ontologies are built manually, many RDF (Resource Description Framework) datasets are still published without any prescribed schema. In this study, we propose a parallel ontology mining approach. Ontology axioms are obtained through statistical measures computed by running SPARQL queries. To improve efficiency, a large Linked Data set is divided into blocks based on the connectivity of property graphs. The mining process is then executed on parallel computing units. The division method ensures that the mining results from the parallel computing units are complete and correct. Evaluations are performed on two kinds of DBpedia datasets, namely the Mapping-based Dataset with an ontology and the Raw Infobox Dataset without an ontology, and the results show the effectiveness and efficiency of our approach.
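
The division step described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not define how the property graph is built, so the sketch assumes that two properties are connected when some resource occurs (as subject or object) in triples of both, and it groups triples by the connected components of that relation using union-find. The function name `partition_by_property_connectivity` and the tuple representation of triples are hypothetical.

```python
from collections import defaultdict


def partition_by_property_connectivity(triples):
    """Split (subject, property, object) triples into independent blocks.

    Assumption (not taken from the paper): two properties are connected
    when at least one resource appears in triples of both. Literal objects
    are treated like any other value here; a real pipeline would likely
    exclude them so that shared literals do not merge unrelated blocks.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Remember one property per resource; any later property touching the
    # same resource is merged into that property's component.
    first_property_of = {}
    for s, p, o in triples:
        find(p)  # register the property even if it stays isolated
        for resource in (s, o):
            if resource in first_property_of:
                union(first_property_of[resource], p)
            else:
                first_property_of[resource] = p

    # Collect triples by the component of their property.
    blocks = defaultdict(list)
    for triple in triples:
        blocks[find(triple[1])].append(triple)
    return list(blocks.values())


if __name__ == "__main__":
    example = [
        ("ex:Berlin", "ex:capitalOf", "ex:Germany"),
        ("ex:Germany", "ex:memberOf", "ex:EU"),
        ("ex:Hamlet", "ex:author", "ex:Shakespeare"),
    ]
    for block in partition_by_property_connectivity(example):
        print(block)
```

Under this assumption no resource is shared between blocks, so each block could be handed to a separate worker (for example through `multiprocessing.Pool.map`) and the mined axioms merged afterwards. Whether that matches the completeness and correctness guarantee stated in the abstract depends on the paper's actual definition of connectivity.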

Keywords

RDF; DBpedia; Linked data