4 Replies - 10871 Views - Last Post: 12 June 2014 - 10:53 PM

#1 Skydiver   User is offline

  • Code herder
  • member icon

Reputation: 7915
  • View blog
  • Posts: 26,425
  • Joined: 05-May 12

Why is Fisher-Yates shuffle good enough?

Post icon  Posted 12 June 2014 - 07:36 AM

After graph theory and calculus, my next biggest weakness in probability and statistics. I'm having a really hard time understanding why the Fisher-Yates shuffle is good enough. I'm asking because I saw Blindman67 post some shuffling code on the C/C++ forum where most of his approaches were O(n2) shuffling techniques, but Fisher-Yates is O(n). I was going to reply that he should simply use Fisher-Yates, but I didn't have a strong enough background in probability to justify why it is good enough.

Implicitly, I understand that flipping a coin 100 times is not going to make the result any more random than just flipping the coin just once. So it follows that randomly picking a cards from a source pile and putting into the target pile until the source pile is exhausted should be random enough. But why do people on insist on shuffling cards at least 3 times, or cutting the deck cards at least once? Is it because most people shuffle by doing a dovetail shuffle, and people with good coordination will come out with perfect dovetails, and therefore a predictable deck? Wouldn't it follow that shuffling 3 times would still be a predictable deck?

Is This A Good Question/Topic? 2
  • +

Replies To: Why is Fisher-Yates shuffle good enough?

#2 baavgai   User is offline

  • Dreaming Coder
  • member icon


Reputation: 7507
  • View blog
  • Posts: 15,558
  • Joined: 16-October 07

Re: Why is Fisher-Yates shuffle good enough?

Posted 12 June 2014 - 08:36 AM

There are two questions here: shuffling in virtual world and shuffling in physical world.

For virtual... I'm afraid I don't have a better understanding than you; nothing formal. However, I'd first posit that your random function is as random as you're going to get. Given this, using that function to deplete a given set would always be best case. Any extra operations offer the potential to introduce non random elements.

For physical... it is what it is. Short of laying the whole deck down and picking up each card individually, which still wouldn't be random, the best you can do is move cards about and hope for the best. And, yes, you can somewhat unshuffle a deck while shuffling it.

There are a number of card tricks that rely on card adjacency to work. That is, you don't know someone's card, but you know the card it was placed next to. A friend was stumped by a varietal of this, and would always ask I perform it for him. He shuffled that deck a whole lot. Still, he somehow rarely managed to break up the paired cards.
Was This Post Helpful? 1
  • +
  • -

#3 BetaWar   User is offline

  • #include "soul.h"
  • member icon

Reputation: 1695
  • View blog
  • Posts: 8,592
  • Joined: 07-September 06

Re: Why is Fisher-Yates shuffle good enough?

Posted 12 June 2014 - 08:48 AM

Based on a little reading over at wikipedia, it looks like dovetail has two forms of "perfect" shuffles which allows for either a. the shuffler to move the top card in 1 place (so there is one card on top of it) or b. preserve the top and bottom cards through the shuffle. So, it stands to reason the thought behind doing the shuffle multiple times it to reduce the likelihood of the shuffler being able to maintain their perfect shuffle through multiple passes in quick succession. However, that won't always hold true. The act of cutting the deck at a random place chosen by someone other than the shuffler allows people to guarantee the top and bottom card have moved thus eliminating much of the benefit of a perfect shuffle.

The Fisher-Yates shuffle technique is actually relatively interesting. Since each card is chosen randomly, you have something like a 1/(52!) chance of knowing the order of the deck afterwards, and all for a linear algorithm.

Dovetail on the other hand would allow for the possibility (assuming you know the initial deck) of knowing the outcome deck with a probability of something along the lines of 1/(2^n) where n is the number of times you shuffle. However, dovetail can also be used to just rotate the internal cards while maintaining the first and last using a perfect shuffle. And the rotation (assuming you know it is happening and are able to interleave the cards one from either side) is cyclical -- thus allowing you to manipulate the deck in to a move favorable position... at least in theory.

1 2 3 4 5 | 6 7 8 9 10

1 6 2 7 3 | 8 4 9 5 10

1 8 6 4 2 | 9 7 5 3 10

1 9 8 7 6 | 5 4 3 2 10


Notice that it starts sorted, then after 3 perfect dovetails you have the first and last card in the same place with the middle cards being in reverse order.

Now, I am not great with probability (in fact that was my worst subject), so the math may be way off, but that's what I see it as at least :)
Was This Post Helpful? 2
  • +
  • -

#4 Skydiver   User is offline

  • Code herder
  • member icon

Reputation: 7915
  • View blog
  • Posts: 26,425
  • Joined: 05-May 12

Re: Why is Fisher-Yates shuffle good enough?

Posted 12 June 2014 - 09:42 PM

Reminds me of one bridge club I visited. Their preferred method of shuffling was to put all the cards face down in the center of the table and everybody massaged them around like mahjong tiles or dominoes for about a minute, and then one person would gather them up into a deck.
Was This Post Helpful? 0
  • +
  • -

#5 jon.kiparsky   User is offline

  • Beginner
  • member icon


Reputation: 12350
  • View blog
  • Posts: 20,984
  • Joined: 19-March 11

Re: Why is Fisher-Yates shuffle good enough?

Posted 12 June 2014 - 10:53 PM

By all accounts, for physical shuffling Persi Diaconis is your man. He says 7 riffle shuffles will randomize the deck.

Wikipedia said:

More precisely, Diaconis showed that, in the Gilbert–Shannon–Reeds model of how likely it is that a riffle results in a particular riffle shuffle permutation, it takes 5 riffles before the total variation distance of a 52-card deck begins to drop significantly from the maximum value of 1.0, and 7 riffles before it drops below 0.5 very quickly (a threshold phenomenon), after which it is reduced by a factor of 2 every shuffle. Interestingly, when entropy is viewed as the probabilistic distance, riffle shuffling seems to take less time to mix, and the threshold phenomenon goes away (because the entropy function is subadditive.).[7]



For virtual shuffling, you have a completely different proposition. The question is really not "why is Fisher-Yates good enough?", it's "why is it better than the obvious alternative?".
I mean, it's good enough because we can prove that it's going to produce a "deck" with maximum disorder, that is, a derangement of the starting configuration in which there is no better than 1/52 chance of predicting the value of the nth card in the sequence for any n. Why is this? It's because Fisher-Yates selects randomly from the remaining deck on each move - that is, the first card in the derangement can be any of the initial cards, with equal probability. The second can be any of the remaining 51, with equal probability, and so forth. Should be obvious that the last card can be any of the 52 with equal probability. Less obvious, but I think it's clear if you work it out, that the likelihood of any card in between being any particular value is exactly 1/52, to the limits of your PRNG. I think it's not hard to see why this is a correct shuffle, if you sharpen a pencil and do the sums.

To make it easy, think about a 4-card deck. Obviously, selecting the first card at random we get a 1/4 chance of any of the four values. Then we select the second card. For each of the four possible first cards, there are three possible choices for the second card, so [2,3,4], [1,3,4], [1,2,4],[1,2,3]. Sorting those, we find that we have again a 1/4 chance of any of the values appearing in the second place. For the third place we have a similar trick, and we end up with 8 chances in 32 for each value to appear in the third slot. And for the last card, I think it's pretty obvious that we have 1/4 chance of any of the possible values in that position. (there are a couple of ways to get this)

But what's the wrong way to do a virtual shuffle? And why is FY better? It's easy enough to state the bad algorithm: it's the obvious one. You iterate over N cards in the deck, and for each card n, you exchange it with some randomly selected card in the deck.
Seems simple enough - why is it wrong? This is a little harder, I think, and if I tried to explain it out of my own mouth, I'd certainly get some details wrong. However, Jeff Atwood has a very good explanation on his blog, and I'll leave it to him. In a nutshell, the problem is based on the fact that some cards get shuffled too much, but read the blog post for the details.

This post has been edited by jon.kiparsky: 12 June 2014 - 11:23 PM

Was This Post Helpful? 4
  • +
  • -

Page 1 of 1