Article ID: 425012
Journal: Future Generation Computer Systems
Published Year: 2014
Pages: 11
File Type: PDF
Abstract

Personalized information systems are information-filtering systems that endeavor to tailor information-exchange functionality to the specific interests of their users. The ability of these systems to profile users is, on the one hand, what enables such intelligent functionality, but on the other, the source of innumerable privacy risks. In this paper, we justify and interpret KL divergence as a criterion for quantifying the privacy of user profiles. Our criterion, which emerged from previous work in the domain of information retrieval, is here thoroughly examined by adopting the beautiful perspective of the method of types and large deviation theory, and under the assumption of two distinct adversary models. In particular, we first elaborate on the intimate connection between Jaynes’ celebrated method of entropy maximization and the use of entropies and divergences as measures of privacy; and second, we interpret our privacy metric in terms of false positives and negatives in a binary hypothesis test.
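As a concrete illustration (not taken from the paper), the Python sketch below computes the KL divergence between a hypothetical user profile and a hypothetical population profile over interest categories, and notes the Stein's-lemma reading that links the divergence to the error exponent of an adversary's hypothesis test; the category probabilities are made up for the example.

```python
# Minimal sketch, assuming a "profile" is a categorical distribution over
# interest categories; the distributions below are illustrative only.
import numpy as np

def kl_divergence(q, p, eps=1e-12):
    """D(q || p) in bits: how far a user's profile q deviates from the
    population profile p. A larger divergence means a more distinctive
    profile and, under this criterion, lower privacy."""
    q = np.asarray(q, dtype=float)
    p = np.asarray(p, dtype=float)
    mask = q > 0                         # 0 * log(0/p) = 0 by convention
    return float(np.sum(q[mask] * np.log2(q[mask] / (p[mask] + eps))))

# Hypothetical distributions over four interest categories.
p_pop  = np.array([0.40, 0.30, 0.20, 0.10])   # population profile
q_user = np.array([0.10, 0.10, 0.20, 0.60])   # observed user profile

d = kl_divergence(q_user, p_pop)
print(f"D(user || population) = {d:.3f} bits")

# Hypothesis-testing reading: if an adversary observes n queries and tests
# "drawn from q_user" against "drawn from p_pop", Stein's lemma says that,
# with one error probability held below a fixed level, the other decays
# roughly like 2^(-n * D(q_user || p_pop)); a larger divergence therefore
# makes the user easier to single out.
n = 50
print(f"approximate error exponent for n={n}: 2^(-{n * d:.1f})")
```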

► We illustrate the privacy risks posed by personalized information systems.
► We justify and interpret KL divergence as a measure of the privacy of user profiles.
► Our metric is examined under the perspective of two distinct adversary models.
► We use Jaynes’ rationale on entropy-maximization methods to justify our criterion.
► Our arguments in favor of divergence also stem from hypothesis testing.

Related Topics
Physical Sciences and Engineering › Computer Science › Computational Theory and Mathematics
Authors