Article ID: 392773
Journal: Information Sciences
Published Year: 2013
Pages: 14
File Type: PDF
Abstract

Web search engines (WSEs) are basic tools for finding and accessing data on the Internet. However, they also put the privacy of their users at risk, because users frequently reveal private information in their queries. WSEs gather this personal data and build user profiles, which are used to provide personalized search (PS). PS improves the users’ search results and is therefore a key element in the success of WSEs: the entity that offers the best search experience should attract more users. Nevertheless, profiles can also be misused by WSEs or stolen by attackers. This situation calls for privacy-preserving schemes able to handle everything from simple queries (a single term) to complex queries (several words, related or unrelated). Generally, these systems generate and submit inaccurate queries in order to provide privacy, but these queries must be built carefully so that the user profiles remain useful. The current literature does not address the generation of complex queries that are both privacy-preserving and useful. This paper therefore presents a new scheme that generates distorted user queries from a semantic point of view in order to preserve the usefulness of user profiles. In addition, linguistic analysis techniques are used to properly interpret complex queries submitted by users and to generate new, semantically related ones accordingly. The performance of the new scheme is evaluated in terms of the semantic preservation of the new queries, privacy level and runtime. A set of query logs taken from real users and compiled by AOL is used as test data.
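
To make the idea of a semantically distorted query concrete, the following is a minimal sketch, not the scheme described in the paper. The abstract does not specify the knowledge base or the linguistic analysis used, so everything here is an assumption: `TOY_TAXONOMY`, `siblings` and `distort_query` are hypothetical names, and the taxonomy is a hand-written stand-in for a real semantic resource. Each known term is swapped for another term under the same parent concept, so the submitted query hides the original term while staying in the same topic area.

```python
# Hypothetical toy taxonomy: term -> parent concept. This is an assumption
# made for illustration; a real system would presumably consult a much
# larger knowledge base.
TOY_TAXONOMY = {
    "diabetes": "disease",
    "asthma": "disease",
    "arthritis": "disease",
    "paris": "city",
    "rome": "city",
    "hotel": "lodging",
    "hostel": "lodging",
}


def siblings(term):
    """Return the other terms that share the same parent concept, sorted."""
    parent = TOY_TAXONOMY.get(term)
    if parent is None:
        return []
    return sorted(t for t, p in TOY_TAXONOMY.items() if p == parent and t != term)


def distort_query(query):
    """Replace every known term with a sibling; leave unknown words as they are."""
    distorted = []
    for word in query.lower().split():
        candidates = siblings(word)
        # Deterministic choice (first sibling) keeps the sketch reproducible;
        # a privacy scheme would more likely pick at random.
        distorted.append(candidates[0] if candidates else word)
    return " ".join(distorted)


print(distort_query("diabetes hotel Paris"))  # arthritis hostel rome
```

Words missing from the toy taxonomy pass through unchanged, which is one reason a sketch like this falls short of real privacy protection: a usable scheme would need broad term coverage and an interpretation step for multi-word queries.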

Related Topics
Physical Sciences and Engineering, Computer Science, Artificial Intelligence