J. Huang, C. Guestrin. Uncovering the Riffled Independence Structure of Ranked Data. Electronic Journal of Statistics, Vol. 6 (2012) 199-230. Communicated July 2010.


Representing distributions over permutations can be a daunting task because the number of permutations of n objects scales factorially in n. One recent approach to reducing storage complexity exploits probabilistic independence, but as we argue, full independence assumptions impose strong sparsity constraints on distributions and are unsuitable for modeling rankings. We identify a novel class of independence structures, called riffled independence, encompassing a more expressive family of distributions while retaining many of the properties necessary for performing efficient inference and reducing sample complexity. In riffled independence, one draws two permutations independently, then performs the riffle shuffle, common in card games, to combine them into a single permutation. Within the context of ranking, riffled independence corresponds to ranking disjoint sets of objects independently, then interleaving those rankings. In this paper, we provide a formal introduction to riffled independence and propose an automated method for discovering sets of items that are riffle independent from a training set of rankings. We show that our clustering-like algorithms can be used to discover meaningful latent coalitions from real preference ranking datasets and to learn the structure of hierarchically decomposable models based on riffled independence.
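
The generative process described in the abstract (rank two disjoint item sets independently, then interleave the rankings) can be illustrated with a minimal Python sketch. This is not the paper's code: the function names `interleave` and `riffle_sample` are hypothetical, and drawing both the per-set rankings and the interleaving uniformly at random is an assumption made only to keep the example short. Riffled independence as described above permits other distributions for each of the three components.

```python
import random


def interleave(rank_a, rank_b, slots_a):
    """Merge two rankings, placing rank_a's items (in order) at the positions in slots_a."""
    it_a, it_b = iter(rank_a), iter(rank_b)
    n = len(rank_a) + len(rank_b)
    # Every position takes the next unused item from A or from B, so the
    # relative order inside each set is preserved in the joint ranking.
    return [next(it_a) if i in slots_a else next(it_b) for i in range(n)]


def riffle_sample(items_a, items_b, rng):
    """Draw one joint ranking of two disjoint item sets (illustrative sketch)."""
    # Rank each set independently (uniform here, which is an assumption).
    rank_a = rng.sample(items_a, len(items_a))
    rank_b = rng.sample(items_b, len(items_b))
    # Choose which positions of the joint ranking go to set A
    # (uniform interleaving, also an assumption).
    n = len(items_a) + len(items_b)
    slots_a = set(rng.sample(range(n), len(items_a)))
    return interleave(rank_a, rank_b, slots_a), rank_a, rank_b


if __name__ == "__main__":
    rng = random.Random(0)
    joint, rank_a, rank_b = riffle_sample(
        ["apple", "banana", "cherry"], ["kale", "leek"], rng
    )
    print(joint)
```

Restricting the joint ranking to either item set recovers the ranking that was drawn for that set, which is the sense in which the two rankings are combined without being reordered internally.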


title = {Uncovering the Riffled Independence Structure of Ranked Data},
author = {Jonathan Huang and Carlos Guestrin},
journal = {Electronic Journal of Statistics},
volume = {6},
pages = {199-230},
year = {2012},