Understanding Your Bitcoin Keys: BIP 39 Seed Words

The bedrock of Bitcoin self-sovereignty is having control over your private keys. Without this, in one way or another, you are relinquishing control of your money to someone else. “Not your keys, not your coins,” as the saying goes. A counter-intuitive aspect of Bitcoin for people who aren’t familiar with its technical underpinnings is “where” your Bitcoin actually is. When people think of a wallet, they think “the place where I keep my money.” Your bitcoin wallet doesn’t actually “hold” your Bitcoin; it just stores your private keys. Your Bitcoin is just entries of data on the blockchain hosted by everyone participating in the network. When you go to spend your bitcoin, what you are actually doing is proposing an update to the data stored on the blockchain. A private key is how the protocol ensures that you, and you alone, can authorize an update to the blockchain that spends your Bitcoin.

So what are your private keys? Just very large numbers. Extremely large. This is a private key in binary:

1110001011011001011110111100000101000100000010001001111010111011010101110111001111111111101010111010010111010011101001110010100110111101000110000111110101111001101001011110011011101000001101101101110001101000110001111010001001001111011010101011001101101010

256 random 1s and 0s. This random number is what ultimately secures your Bitcoin. It might not look like much, but its randomness is what ensures your wallet’s security. There are 2^256 (roughly 10^77) possible 256-digit binary numbers, almost as many as there are atoms in the visible universe. That is how many numbers a computer would have to count through to generate and catalog every possible private key. As long as the process used to generate the keys is truly random, your keys are safe.
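As a rough illustration of what "generating a random key" means, here is a minimal Python sketch that draws a 256-digit binary number from the operating system's secure random source. The helper name generate_private_key is made up for this example, and the upper bound N (the order of Bitcoin's curve, which a valid key must stay below) is a detail this article does not otherwise cover. Real wallets do considerably more than this.

```python
import secrets

# Order of the secp256k1 curve: a valid private key lies in the range 1..N-1.
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141

def generate_private_key():
    """Draw 256 random bits, retrying in the astronomically rare case the value is out of range."""
    while True:
        candidate = secrets.randbits(256)
        if 1 <= candidate < N:
            return candidate

key = generate_private_key()
print(format(key, "0256b"))  # the key written as 256 binary digits
print(f"{2**256:.3e}")       # size of the 256-bit number space, about 1.158e+77
```

The secrets module is used instead of random because random is not designed for security-sensitive values.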

This is what a private key looks like in hexadecimal (binary uses two digits, 0 and 1, to encode a number; hexadecimal uses 16 digits, 0-9 and A-F):

E2D97BC144089EBB5773FFABA5D3A729BD187D79A5E6E836DC68C7A24F6AB36A
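Binary and hexadecimal are just two ways of writing the same number. A short Python sketch, using the hexadecimal key above as input, shows the conversion in both directions (the variable names are only for illustration):

```python
# The hexadecimal private key quoted in the text above.
hex_key = "E2D97BC144089EBB5773FFABA5D3A729BD187D79A5E6E836DC68C7A24F6AB36A"

number = int(hex_key, 16)                        # read the base-16 digits as an integer
binary_key = format(number, "0256b")             # the same integer as 256 base-2 digits
round_trip = format(int(binary_key, 2), "064X")  # and back to 64 base-16 digits

print(binary_key)
print(round_trip == hex_key)
```

Each hexadecimal digit stands for exactly four binary digits, which is why 256 binary digits shrink to 64 hexadecimal ones.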

This is what a private key looks like in uncompressed Wallet Import Format (WIF):

5KYC9aMMSDWGJciYRtwY3mNpeTn91BLagdjzJ4k4RQmdhQvE98G

WIF is how everyone used to interact with their private keys in the early days of Bitcoin. In that era, you would generate one private key at a time, and then generate the public key from it. The process of generating a public key is essentially just the multiplication of very large numbers, though there is a bit more to it than that. Every public key is a point, with an x and a y coordinate, on a graph of a very, very large curve that loops back on itself.

On the curve, in Bitcoin’s case secp256k1, there is a point called the “generator point.” This generator point can be thought of as the “base point” on the secp256k1 curve. It is integral to the process of generating keys and signing with them. This is the generator point for Bitcoin’s curve (shown as a 02 prefix followed by its x coordinate):

G = 02 79BE667E F9DCBBAC 55A06295 CE870B07 029BFCDB 2DCE28D9 59F2815B 16F81798

To generate the public key from your private key, you take the private key you generated and multiply it by the generator point. That’s it. This now establishes a point on the graph with a mathematical relationship to the private key you generated that only you know.
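"Multiplying by the generator point" means adding the point to itself over and over using the curve's own addition rule. The following is a minimal, unoptimized Python sketch of that idea (Python 3.8 or later). The function names are invented for this example, the constants are the published secp256k1 parameters, and the code is not constant-time, so it is for illustration only and should never touch real keys. The article does not say whether its sample public key was derived from its sample private key, so no particular output is claimed here.

```python
# Published secp256k1 constants.
P = 2**256 - 2**32 - 977  # the prime at which coordinates "loop back"
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141  # order of G
G = (
    0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
    0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8,
)

def point_add(a, b):
    """Add two curve points; None stands for the 'point at infinity' (the zero element)."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # a point plus its mirror image cancels out
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1, -1, P) % P       # tangent line at the point
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P  # line through both points
    x3 = (slope * slope - x1 - x2) % P
    y3 = (slope * (x1 - x3) - y1) % P
    return (x3, y3)

def scalar_mult(k, point=G):
    """Compute k * point with the double-and-add method."""
    result, addend = None, point
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result

def public_key_hex(private_key):
    """Return the uncompressed public key: 04, then x, then y."""
    x, y = scalar_mult(private_key)
    return "04" + format(x, "064X") + format(y, "064X")

private_key = int("E2D97BC144089EBB5773FFABA5D3A729BD187D79A5E6E836DC68C7A24F6AB36A", 16)
print(public_key_hex(private_key))
```

Going forward (private key to public key) takes a few hundred point additions. Going backward would require finding how many times G was added to itself, which is what makes the private key safe to keep behind a public one.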

This is an uncompressed public key showing both x and y points:

04C0E410A572C880D1A2106AFE1C6EA2F67830ABCC8BBDF24729F7BF3AFEA06158F0C04D7335D051A92442330A50B8C37CE0EC5AFC4FFEAB41732DA5108261FFED

In the rare case you interact with public keys directly, they are usually “compressed”: only the x coordinate is stored, along with a one-byte prefix (02 or 03) that tells you whether the y coordinate is even or odd. That shortens it considerably:

03C0E410A572C880D1A2106AFE1C6EA2F67830ABCC8BBDF24729F7BF3AFEA06158
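The compression step can be sketched in a few lines of Python. Strictly speaking, the prefix byte records whether the y coordinate is even (02) or odd (03); the curve equation then lets software recover the full y value from x. The function name compress_public_key is invented for this example.

```python
def compress_public_key(uncompressed_hex):
    """Turn a 65-byte uncompressed key (04 + x + y) into a 33-byte compressed key (02/03 + x)."""
    raw = bytes.fromhex(uncompressed_hex)
    if len(raw) != 65 or raw[0] != 0x04:
        raise ValueError("expected a 65-byte uncompressed key starting with 04")
    x, y = raw[1:33], raw[33:]
    prefix = b"\x03" if y[-1] & 1 else b"\x02"  # odd y -> 03, even y -> 02
    return (prefix + x).hex().upper()

# The uncompressed sample key quoted in the text above.
uncompressed = (
    "04"
    "C0E410A572C880D1A2106AFE1C6EA2F67830ABCC8BBDF24729F7BF3AFEA06158"
    "F0C04D7335D051A92442330A50B8C37CE0EC5AFC4FFEAB41732DA5108261FFED"
)
print(compress_public_key(uncompressed))
```

The sample key's y coordinate ends in the hexadecimal digit D, which is odd, so its compressed form starts with 03.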

When you go to sign a transaction with your private key, it once again boils down to essentially just multiplication. By generating a random number (the nonce), and using that and your private key to essentially multiply the hash of the transaction you are signing, you produce the signature (which is made up of two values, r and s). This allows anyone to run an algorithm that verifies the transaction was signed by the private key behind your public key, without that key ever being revealed. The thing guaranteeing only you can authorize spending your Bitcoin is essentially just the multiplication of very, very large numbers.
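For readers who want to see the arithmetic, here is a compact, illustrative Python sketch of ECDSA, the signature scheme described above (Python 3.8 or later). Everything in it is an assumption made for the example: the function names, the stand-in message, the throwaway key, and the use of a double SHA-256 hash as the value being signed. It is not constant-time and skips protections real wallets rely on, so treat it as a picture of the math rather than something to sign with.

```python
import hashlib
import secrets

# Published secp256k1 constants.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(a, b):
    """Curve point addition; None is the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        m = 3 * x1 * x1 * pow(2 * y1, -1, P) % P
    else:
        m = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (m * m - x1 - x2) % P
    return (x3, (m * (x1 - x3) - y1) % P)

def mul(k, point=G):
    """k * point by double-and-add."""
    result = None
    while k:
        if k & 1:
            result = add(result, point)
        point = add(point, point)
        k >>= 1
    return result

def message_hash(message):
    """Double SHA-256 of the message, read as an integer."""
    return int.from_bytes(hashlib.sha256(hashlib.sha256(message).digest()).digest(), "big")

def sign(private_key, message):
    z = message_hash(message)
    while True:
        k = secrets.randbelow(N - 1) + 1               # fresh random nonce in 1..N-1
        r = mul(k)[0] % N                              # x coordinate of nonce * G
        s = pow(k, -1, N) * (z + r * private_key) % N  # ties hash, nonce, and key together
        if r and s:
            return r, s

def verify(public_key, message, signature):
    r, s = signature
    if not (1 <= r < N and 1 <= s < N):
        return False
    w = pow(s, -1, N)
    z = message_hash(message)
    point = add(mul(z * w % N), mul(r * w % N, public_key))
    return point is not None and point[0] % N == r

private_key = secrets.randbelow(N - 1) + 1   # throwaway key for the demonstration
public_key = mul(private_key)
signature = sign(private_key, b"stand-in for transaction data")
print(verify(public_key, b"stand-in for transaction data", signature))
```

Note that verify only ever sees the public key, the message, and the two values r and s. The nonce must be fresh and secret for every signature, since reusing one allows the private key to be computed from two signatures.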

If you weren’t all that familiar with these concepts before reading this, all of it probably seems somewhat intimidating. Binary? Hexadecimal? Graph points? How do you back up a WIF?

Since the development of more intuitive ways of handling this data, most users are unfamiliar with these complicated formats. Most likely, you have more experience with word seeds, also known as seed phrases.

BIP 39 Mnemonic Seeds

Mnemonic seeds, or seed phrases, were created to improve the experience of interacting with your private keys.

As we discussed earlier, private keys are ultimately just a long series of 1s and 0s that are randomly generated. Imagine trying to create copies of this and ensure you didn’t make an error transcribing it:

1110001011011001011110111100000101000100000010001001111010111011010101110111001111111111101010111010010111010011101001110010100110111101000110000111110101111001101001011110011011101000001101101101110001101000110001111010001001001111011010101011001101101010

All it would take is a single error copying one digit to render a backup of your keys useless, and a useless backup can mean losing access to your Bitcoin. A long run of 1s and 0s is not a human-friendly way to interact with sensitive information. This is where mnemonic seeds come in handy:

truck renew fury donkey remind laptop reform detail split grief because fat

That is much easier to deal with, isn’t it? Just 12 words. So how does that work, going from a bunch of random 1s and 0s to a string of words that actually make sense to you? An encoding scheme, just like binary or hexadecimal!

Each of those 12 words in the mnemonic seed above stands for a binary number in an encoding scheme that maps specific strings of 1s and 0s to words. If we look back at the WIF private key example earlier, that was simply a number encoded in a specific encoding scheme, in that case base 58, which uses every number and letter of the alphabet (case sensitive) except 0 (zero), O (capital o), I (capital i), and l (lowercase L). Those characters were excluded specifically to make transcription errors less likely, such as confusing a 1 for an l or a 0 for an O. Bech32 and bech32m, used by SegWit and Taproot addresses, take this to the next level by using only this set of characters (qpzry9x8gf2tvdw0s3jn54khce6mua7l).
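To make the base 58 idea concrete, here is a minimal Python sketch of how a hexadecimal private key is commonly turned into uncompressed WIF: a version byte (0x80 for mainnet) is put in front, four checksum bytes from a double SHA-256 hash are added to the end, and the result is written in base 58. The function names are invented for this example, and no claim is made that the output matches the sample WIF shown earlier, since the article does not say the two samples are the same key.

```python
import hashlib

# The 58 characters of Bitcoin's base 58 alphabet: no 0, O, I, or l.
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def base58_encode(data):
    """Write a byte string as a base 58 number using ALPHABET."""
    number = int.from_bytes(data, "big")
    encoded = ""
    while number:
        number, remainder = divmod(number, 58)
        encoded = ALPHABET[remainder] + encoded
    # Each leading zero byte is conventionally written as a leading '1'.
    padding = len(data) - len(data.lstrip(b"\x00"))
    return "1" * padding + encoded

def private_key_to_wif(hex_key):
    """Uncompressed mainnet WIF: version byte + key + 4-byte checksum, in base 58."""
    payload = b"\x80" + bytes.fromhex(hex_key)
    checksum = hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]
    return base58_encode(payload + checksum)

wif = private_key_to_wif("E2D97BC144089EBB5773FFABA5D3A729BD187D79A5E6E836DC68C7A24F6AB36A")
print(wif)
```

The checksum bytes play the same role for WIF that the checksum word plays for a seed phrase: a wallet can recompute them and reject a key that was copied incorrectly.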

Bitcoin Improvement Proposal 39 (BIP 39) introduced a standardized encoding scheme where each word in a specially crafted dictionary is mapped, in alphabetical order, to a binary number from 00000000000 to 11111111111. The demonstration seed above maps to this:

truck: 11101001001

renew: 10110110001

fury: 01011110011

donkey: 01000001001

remind: 10110101110

laptop: 01111101000

reform: 10110100010

detail: 00111100010

split: 11010010001

grief: 01100110100

because: 00010011110

fat: 01010011011

In just binary it looks like this:

11101001001 10110110001 01011110011 01000001001 10110101110 01111101000 10110100010 00111100010 11010010001 01100110100 00010011110 0101001 1011

There are 2048 words, each mapped to a specific 11-digit string of 1s and 0s, specifically to make it easier for people to interact with their private keys. When your wallet generates the random number behind your keys (128 digits for a 12-word seed, 256 digits for a 24-word seed), it cuts that number up into chunks of 11 digits and maps each chunk to a word in the BIP 39 mnemonic dictionary. It’s still the same large number, but now you can read it as English words. Since your brain is much more accustomed to this format than long strings of 1s and 0s, this drastically reduces the odds of you writing down something wrong and losing your Bitcoin in the process.
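The chunk-and-look-up step can be sketched in Python. The full BIP 39 dictionary has 2048 entries; to keep the example self-contained, the lookup table below contains only the twelve word-and-digits pairs listed above, so it can decode this one demonstration seed and nothing else. The names are invented for the example.

```python
# Word-to-digits pairs copied from the list above (a tiny excerpt of the 2048-word dictionary).
TABLE = {
    "truck": "11101001001", "renew": "10110110001", "fury": "01011110011",
    "donkey": "01000001001", "remind": "10110101110", "laptop": "01111101000",
    "reform": "10110100010", "detail": "00111100010", "split": "11010010001",
    "grief": "01100110100", "because": "00010011110", "fat": "01010011011",
}
DIGITS_TO_WORD = {digits: word for word, digits in TABLE.items()}

def digits_to_words(bitstring):
    """Cut the digits into 11-digit chunks and look each chunk up in the table."""
    chunks = [bitstring[i:i + 11] for i in range(0, len(bitstring), 11)]
    return [DIGITS_TO_WORD[chunk] for chunk in chunks]

# The raw binary form of the demonstration seed, with the spaces removed.
seed_digits = (
    "11101001001 10110110001 01011110011 01000001001 10110101110 01111101000 "
    "10110100010 00111100010 11010010001 01100110100 00010011110 0101001 1011"
).replace(" ", "")

print(" ".join(digits_to_words(seed_digits)))
print(int(TABLE["because"], 2))  # position of "because" in the dictionary, counting from 0
```

Reading each 11-digit chunk as a number gives the word's position in the alphabetical dictionary, counting from zero.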

You may have noticed that in the raw binary encoding of the word seed above, there are four digits (1011) sitting off on their own, and the last “word” is actually only 7 digits. Those four digits are a checksum to ensure that a seed phrase is valid. When you generate your random number, there aren’t enough digits to map it exactly to 12 (or 24) words: 128 digits fill 11 words with 7 digits left over, and 256 digits fill 23 words with 3 left over. The wallet hashes the digits you generated with SHA-256 and appends the first few digits of the hash (4 for a 12-word seed, 8 for a 24-word seed) to the end of your random number. This gives you enough digits to map to the last word.

This last word allows you to perform a safety check on copies of your seed. If you enter your mnemonic seed into a wallet incorrectly, the checksum will most likely not match. For any given set of preceding words there are multiple valid final words (128 for a 12-word seed, 8 for a 24-word seed), but if the last word doesn’t match the checksum of the words before it, your wallet will warn you the seed is invalid. This gives people an intuitive yet still mathematical way to check that their backups are correct, unlike the messy process of transcribing and backing up the raw binary numbers.
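The checksum rule can be sketched in Python as follows. In BIP 39 the hash is SHA-256 and the checksum is one digit for every 32 digits of randomness, so 4 digits for a 12-word seed and 8 for a 24-word seed. The function names are invented for this example, and the result for the demonstration seed is printed rather than asserted.

```python
import hashlib

def checksum_digits(entropy):
    """First (number of entropy bits / 32) binary digits of SHA-256(entropy)."""
    digest = hashlib.sha256(entropy).digest()
    count = len(entropy) * 8 // 32
    return format(int.from_bytes(digest, "big"), "0256b")[:count]

def is_valid(seed_digits):
    """Split off the trailing checksum digits and compare them with a freshly computed checksum."""
    entropy_length = len(seed_digits) * 32 // 33  # 132 -> 128, 264 -> 256
    entropy_digits, claimed = seed_digits[:entropy_length], seed_digits[entropy_length:]
    entropy = int(entropy_digits, 2).to_bytes(entropy_length // 8, "big")
    return checksum_digits(entropy) == claimed

# The raw binary form of the demonstration seed, with the spaces removed.
demo = (
    "11101001001 10110110001 01011110011 01000001001 10110101110 01111101000 "
    "10110100010 00111100010 11010010001 01100110100 00010011110 0101001 1011"
).replace(" ", "")

print(is_valid(demo))
print(checksum_digits(bytes(16)))  # checksum for 128 zero digits
```

With only 4 checksum digits, a 12-word seed that was garbled at random still passes the check about one time in sixteen, which is why the checksum is a strong hint rather than absolute proof that a backup is correct.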

The selection of the specific words on the list even went so far as to guarantee that none of the 2048 words have the same first four letters. This was done to reduce the likelihood of people making transcription errors by confusing similar words and winding up with an incorrect backup of their private keys.

Translating these words into a set of multiple private/public keys is fairly straightforward. Your mnemonic seed is first stretched with PBKDF2 (2048 rounds of HMAC-SHA512) into a 512-bit seed, and that seed is then hashed with HMAC-SHA512, which outputs a hash of 512 individual 1s and 0s. Half of that output is used as an actual private key, and the other half (called the chain code) is used as input to HMAC-SHA512 with an index number and the existing private or public key to generate a new key pair. You can do this as many times as you want to generate new private/public keys that can all be recovered from your single mnemonic phrase.
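The specifications behind this step (BIP 39 for the seed, BIP 32 for the key tree) are a little more specific than a single SHA512 call: the words are stretched with PBKDF2 using HMAC-SHA512, and the keys are split off with further HMAC-SHA512 calls. A minimal Python sketch under those rules follows. The function names are invented, only "hardened" child keys are shown because they need no curve arithmetic, text normalization is skipped on the assumption that the phrase is plain lowercase English, and a few rare edge cases that the specifications handle are ignored.

```python
import hashlib
import hmac

# Order of the secp256k1 curve; child keys are reduced modulo this number.
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141

def mnemonic_to_seed(mnemonic, passphrase=""):
    """BIP 39: stretch the words into a 64-byte seed with 2048 rounds of PBKDF2-HMAC-SHA512."""
    salt = ("mnemonic" + passphrase).encode("utf-8")
    return hashlib.pbkdf2_hmac("sha512", mnemonic.encode("utf-8"), salt, 2048, dklen=64)

def master_key(seed):
    """BIP 32: the left half becomes the master private key, the right half the chain code."""
    digest = hmac.new(b"Bitcoin seed", seed, hashlib.sha512).digest()
    return digest[:32], digest[32:]

def hardened_child(private_key, chain_code, index):
    """Derive hardened child number `index` from a parent private key and chain code."""
    hardened_index = index | 0x80000000
    data = b"\x00" + private_key + hardened_index.to_bytes(4, "big")
    digest = hmac.new(chain_code, data, hashlib.sha512).digest()
    child = (int.from_bytes(digest[:32], "big") + int.from_bytes(private_key, "big")) % N
    return child.to_bytes(32, "big"), digest[32:]

phrase = "truck renew fury donkey remind laptop reform detail split grief because fat"
seed = mnemonic_to_seed(phrase)
key, chain_code = master_key(seed)
child_key, child_chain_code = hardened_child(key, chain_code, 0)
print(seed.hex())
print(child_key.hex())
```

Because every step is deterministic, running this again with the same twelve words always produces the same keys, which is what lets one written phrase back up an entire wallet. Changing the index produces a different child key each time.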

This ensures that you can manage your private keys as easily, and safely, as possible with the lowest odds of making a mistake that loses your money. And all of it was done using math! Hopefully, now you have a good understanding of why people say that Bitcoin is money ‘secured by math.’


