Here is a problem I was thinking about today: how to use a six-sided die to generate a seed phrase for a blockchain wallet. The underlying problem is that 6 = 2 × 3, and that factor of 3 messes things up.
What is a seed phrase?
To generate addresses for a blockchain like Bitcoin, you need a secret random number that no one can guess. The number should be about 128 bits long, so that’s a lot of 1s and 0s to remember. As almost no one is capable of such a memory feat, and because if one of the digits is written down incorrectly, you won’t be able to digitally sign transactions properly, Bitcoin Improvement Proposal 39 (BIP39) was proposed.
Under BIP39, a dictionary of 2048 standard words was generated, starting with the word abandon, and ending with the word zoo.
2048 is 2¹¹, so each word represents eleven bits (for example, abandon is 00000000000, kitten is 01111011010, and zoo is 11111111111). If you have a list of 12 words, you have 11 × 12 = 132 bits, so the last 4 bits are used as a checksum for the prior 128 bits, with those first 128 bits being the master key used to derive all the subsequent private keys in your blockchain wallet.
You end up with something like: carpet baby bicycle betray shift approve barrel phrase measure prevent image brand.
The list of words produces the random number from which all the private keys of your blockchain wallet are derived. That means that one seed phrase can be used to secure all your digital assets, provided you keep it safe, keep it secret, and select it in a truly random manner.
Why would anyone want to generate one manually?
If you throw dice or toss coins to generate your private key or seed phrase, then you avoid the risk of the seed phrase generated by your wallet not being truly random due to a hardware or software fault in the random number generator, either accidentally or maliciously.
Why is that important?
I discovered the other day that three paper wallets I created back in 2014, each of which I sent 0.01 bitcoin to, were generated with a flawed online paper wallet generator. I can’t even remember which website I used, but the private keys turned out to be 0x01, 0x02, and 0x03.
In today’s prices that’s 1800€ lost, but at the time it was only 9€, so I’m not crying into my dinner.
Back to the original problem, though: throwing dice to generate seed phrases.
Tossing a coin versus throwing a die
To generate a 12-word seed phrase you could toss a coin 128 times, writing down 1 for heads and 0 for tails, but that is an awful lot of work to generate randomness.
An eight-sided die generates 3 bits for each throw (1 = 000, 2 = 001, 3 = 010, 4 = 011, 5 = 100, 6 = 101, 7= 110, and 8 = 111). This reduces the amount of effort to throwing the die 43 times to produce the seed phrase, or 42 times plus a couple of coin tosses.
However, most people are not avid role-playing gamers and therefore don’t have an eight-sided die conveniently to hand.
The six-sided die problem
Most people do have access to some six-sided dice, for example, from a Monopoly or Yahtzee set. There is an upper bound to the number of throws: counting an even number as a 0 and an odd number as a 1, we need 12 × 11 - 4 = 128 throws.
Flawed method one
Throwing a number between 1 and 6 and writing down a corresponding bit pattern gives us the following index:
Although it looks like we can generate the required number of bits in 43 throws (128 ÷ 3, rounded up), this means that every third bit is two times more likely to be a 0 than a 1, and the same for the bit after every third bit.
Our selection of our random number is not uniformly distributed. The numbers we generally will usually contain far more 0s than 1s.
Flawed method two
2047 in base 10 is 13251 in base six. Another simplistic approach is to throw the die five times, writing down each result minus one. That will give us a number from 00000 to 55555 in base six (between 0 and 7776 in decimal), uniformly distributed. We can then accept the number if it is less or equal to 13251 in base six, or throw again if we get a number over 13251 in base six. Do this twelve times, and we have our seed phrase.
But this is terribly inefficient: we will be discarding almost three-quarters of the numbers we throw away, and that means we don’t know how many times we’ll have to repeat our “throw the die five times” exercise. The second problem is that the process is not guaranteed to terminate.
There’s about a 74% chance we’ll have to throw a number away for being too big.
Our expected number of throws per word is:
If I’ve done my calculations correctly converges to a number just under 19. The total expected number of throws is therefore 12 × 19 = 228, which is worse than the even/odd method.
Flawed method three
Perhaps we can “scale” the values? If we’re getting something between 0 and 7776 (non-inclusive), but we want something between 0 and 2048 (non-inclusive), why not multiply by 2048, divide by 7776, then round the value down to the nearest integer?
Unfortunately, that doesn’t work. We end up with a finite set of possible fractional parts, which leads to a substantial bias in the distribution if the number of faces the die has is a factor that is not a factor of the number being sought. This is the case with our six-sided die: powers of two do not contain a factor of 3.
For example, imagine we have an unusual seven-sided die that can create numbers between 0 and 6 (inclusive), but we want numbers between 0 and 3 (inclusive).
As you can see, some of the resulting digits occur more often than others, so it is not uniform. The six-sided die thrown five times isn’t as skewed as this, but it’s still bad.
Method three fails.
Method four
The best method I have found so far is to use the following table:
If you throw a 1, 2, 3, or 4, you get two bits. If you throw a 5 or a 6, you only get 1 bit. In the worst case, if you only throw fives or sixes, it will take 128 throws to generate the number, although in that case, you should really be checking if your dice are loaded.
As on average, two-thirds of your throws will be below 5, you will usually only need about 86 throws.
Conclusion
Unless someone reading this can come up with a simple method that requires less than 86 throws, I recommend that you just buy a set of eight-sided dice off the internet.
You can get ten different colored ones for less than $5.
Then you throw the whole lot four times, and a final throw of three of them gives you the number you need for your seed phrase.