Statistics and Learning Theory
L.3.3
Lecture
Distributed Learning Performance Analysis

Crowdsourced Classification with XOR Queries: An Algorithm with Optimal Sample Complexity

Daesung Kim, Hye Won Chung

Date & Time

01:00 am – 01:00 am

Abstract

We consider the crowdsourced classification of $m$ binary labels with XOR queries that ask whether the number of objects having a given attribute in the chosen subset of size $d$ is even or odd. The subset size $d$, which we call query degree, can be varying over queries. Since a worker needs to make more efforts to answer a query of a higher degree, we consider a noise model where the accuracy of worker's answer changes depending both on the worker reliability and query degree $d$. For this general model, we characterize the information-theoretic limit on the optimal number of queries to reliably recover $m$ labels in terms of a given combination of degree-$d$ queries and noise parameters. Further, we propose an efficient inference algorithm that achieves this limit even when the noise parameters are unknown.


Presenters

Session Chair

Lav Varshney

University of Illinois, Urbana-Champaign