In many applications, the content that users access (movies, videos, news articles, etc.) can leak sensitive latent attributes such as religious and political views, sexual orientation, ethnicity, and gender. To prevent such leakage, classical private information retrieval (PIR) hides the identity of the content/message being accessed, which in turn hides the latent attributes. This solution, while private, can be too costly, particularly when perfect (information-theoretic) privacy constraints are imposed. For instance, for a single database holding K messages, privately retrieving one message is possible if and only if the user downloads the entire database of K messages. Hiding the identity of the retrieved content, however, may not be necessary to perfectly hide the latent attributes. Motivated by this observation, we formulate and study the problem of latent-variable private information retrieval (LV-PIR), which aims to let the user efficiently retrieve one out of K messages (indexed by θ) without revealing any information about the latent variable (modeled by S). We focus on the practically relevant setting of a single database and show that one can significantly reduce the download cost of LV-PIR (compared to classical PIR) based on the correlation between θ and S. We present a general scheme for LV-PIR as a function of the statistical relationship between θ and S, and also provide new results on the capacity/download cost of LV-PIR. Several open problems and new directions are also discussed.
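
To make the gap between hiding θ and hiding S concrete, the following is a minimal toy sketch, not the scheme from the paper. It assumes a hypothetical setup with K = 4 messages, θ uniform, S a deterministic function of θ, and a user who requests a fixed pair of messages containing θ. All names (`latent`, `query_of`, `leakage`) are illustrative assumptions. The sketch computes the mutual information I(S; Q) between the latent variable and the query Q seen by the database.

```python
from collections import defaultdict
from math import log2

# Hypothetical toy setup (not from the source): K = 4 messages, theta uniform,
# and the latent attribute S is a deterministic function of theta.
latent = {1: 0, 2: 0, 3: 1, 4: 1}      # theta -> S
p_theta = {t: 0.25 for t in latent}    # uniform prior over theta

# Assumed query rule: the user asks for a fixed pair that contains theta.
# Each pair holds exactly one message from every latent class.
query_of = {1: (1, 3), 3: (1, 3), 2: (2, 4), 4: (2, 4)}

# Baseline for comparison: request the desired message directly.
naive_query = {t: (t,) for t in latent}


def mutual_information(joint):
    """I(A;B) in bits for a joint pmf given as {(a, b): probability}."""
    pa, pb = defaultdict(float), defaultdict(float)
    for (a, b), p in joint.items():
        pa[a] += p
        pb[b] += p
    return sum(p * log2(p / (pa[a] * pb[b]))
               for (a, b), p in joint.items() if p > 0)


def leakage(query_rule):
    """I(S; Q): what the database learns about S from the query Q."""
    joint = defaultdict(float)
    for theta, p in p_theta.items():
        joint[(latent[theta], query_rule[theta])] += p
    return mutual_information(joint)


lv_leak = leakage(query_of)        # leakage about S under the pair queries
lv_cost = len(query_of[1])         # messages downloaded per pair query
naive_leak = leakage(naive_query)  # leakage about S under direct requests

print(f"pair query:   I(S;Q) = {lv_leak:.3f} bits, download = {lv_cost} of {len(latent)} messages")
print(f"direct query: I(S;Q) = {naive_leak:.3f} bits, download = 1 of {len(latent)} messages")
```

In this toy case the pair query is independent of S while downloading only 2 of the 4 messages, whereas a direct request reveals S completely. The database still learns which pair contains θ, so the message identity itself is not fully hidden; only the latent variable is.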


Islam Samy

University of Arizona

Mohamed A. Attia

University of Arizona

Ravi Tandon

University of Arizona

Loukas Lazos

University of Arizona

Session Chair

Mark Wilde

Louisiana State University