Statistics and Learning Theory
Distributed Learning Performance Analysis

Crowdsourced Classification with XOR Queries: An Algorithm with Optimal Sample Complexity

Daesung Kim, Hye Won Chung

Date & Time

01:00 am – 01:00 am


We consider the crowdsourced classification of $m$ binary labels with XOR queries that ask whether the number of objects having a given attribute in the chosen subset of size $d$ is even or odd. The subset size $d$, which we call query degree, can be varying over queries. Since a worker needs to make more efforts to answer a query of a higher degree, we consider a noise model where the accuracy of worker's answer changes depending both on the worker reliability and query degree $d$. For this general model, we characterize the information-theoretic limit on the optimal number of queries to reliably recover $m$ labels in terms of a given combination of degree-$d$ queries and noise parameters. Further, we propose an efficient inference algorithm that achieves this limit even when the noise parameters are unknown.


Session Chair

Lav Varshney

University of Illinois, Urbana-Champaign