Abstract
Typical retrieval systems have three requirements: a) Accurate retrieval, ie, the method should have high precision, b) Diverse retrieval, ie, the obtained set of samples should be diverse, and c) Retrieval time should be small. However, most of the existing methods address only one or two of the above mentioned requirements. In this work, we present a method based on randomized locality sensitive hashing which tries to address all of the above requirements simultaneously. While earlier hashing-based approaches considered approximate retrieval to be acceptable only for the sake of efficiency, we argue that one can further exploit approximate retrieval to provide impressive trade-offs between accuracy and diversity. We also extend our method to the problem of multi-label prediction, where the goal is to output a diverse and accurate set of labels for a given document in real-time. Finally, we present empirical …