Article ID: 396470
Journal: Information Systems
Published Year: 2016
Pages: 19
File Type: PDF
Abstract

• A tool producing structured data from product features on the web is introduced.
• This is the first Protégé plug-in that extracts product features from web pages.
• Extracting information from complex data-intensive web sites is partially handled.
• The user creates a template manually using a domain-specific language.
• The output is GoodRelations snippets containing product features in RDFa/Microdata.

This paper introduces a tool that produces structured, interoperable data from product features, i.e., attribute name–value pairs, on the web. The tool extracts the product features using a web site-specific template created by the user. The value of the extracted data is maximized by using GoodRelations, which is the standard vocabulary for modeling product types and their features. The final output of the tool is GoodRelations snippets, which contain product features encoded in RDFa or Microdata. These snippets can be embedded into existing static and dynamic web pages, where they are accessible to major search engines like Google and Yahoo, mobile applications, and browser extensions. Embedding them increases the visibility of the described products and services in the latest generation of search engines, recommender systems, and other novel applications.
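
To make the output format concrete, the following is a minimal sketch, not the tool's actual implementation, of how already-extracted attribute name–value pairs could be serialized as a GoodRelations snippet in RDFa. The function name `to_rdfa`, the `ex:` namespace, and the way feature names are turned into property names are all assumptions made for illustration; only the GoodRelations namespace, `gr:SomeItems`, and `gr:name` come from the vocabulary itself. The template-driven extraction step is not shown.

```python
from html import escape

GR = "http://purl.org/goodrelations/v1#"   # GoodRelations namespace
EX = "http://example.com/vocab#"           # hypothetical namespace for site-specific features


def to_rdfa(product_name, features):
    """Serialize a product name and its name-value features as an RDFa snippet.

    `features` is a dict of attribute name -> value, assumed to have been
    extracted from a web page beforehand.
    """
    lines = [
        f'<div prefix="gr: {GR} ex: {EX}" typeof="gr:SomeItems">',
        f'  <span property="gr:name">{escape(product_name)}</span>',
    ]
    for name, value in features.items():
        # Assumption: derive a property name such as ex:ScreenSize
        # from the human-readable feature name "Screen size".
        local = "".join(ch for ch in name.title() if ch.isalnum())
        lines.append(
            f'  <span property="ex:{local}" content="{escape(value)}">'
            f"{escape(name)}: {escape(value)}</span>"
        )
    lines.append("</div>")
    return "\n".join(lines)


snippet = to_rdfa("Acme Phone", {"Screen size": "5.5 in", "Color": "Black & white"})
print(snippet)
```

The resulting markup could be pasted into an existing product page. A real deployment would likely map features to properties defined as specializations of the GoodRelations feature properties rather than to an ad hoc `ex:` namespace.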

Related Topics
Physical Sciences and Engineering > Computer Science > Artificial Intelligence