What are Mnemonic Phrases?
The basis for generating keys and addresses is randomness. Ownership on the blockchain is highly secure, as long as the private key cannot be reproduced by an adversary.
The private key is the basis of the public key and address(es). As such, it requires a high level of security, and needs to be generated from a high level of entropy.
Generating Entropy and a Mnemonic Phrase
There are different sources for entropy:
- Hardware
- User input
- Microphone
- And camera inputs.
Preferably, some of the entropy is hardware based and uses more than a single source. Human based entropy is usually less random and therefore less secure.
The initial entropy is used as a seed for a deterministic random bit generator (DRBG), also called pseudorandom number generator (PRNG).
DRBGs should be standardized and validated. One example for a vetted algorithm is the NIST SP800-90A compliant method. The deterministic random bit generator produces a seed, which we then use to generate your mnemonic phrase.
What is a Mnemonic Phrase?
Your mnemonic phrase is a backup of your private key that is used by most wallets, it is a list of random words given to you when creating a wallet.
If you break or lose a device with a wallet - no matter if mobile, desktop or hardware wallet - your mnemonic phrase is usually your last line of defense against a loss of funds.
This implies that any attacker that gets their hands on your recovery phrase will be able to do the same. Therefore, you must protect your mnemonic phrase as well as you would protect your funds themselves.
You should write the phrase of words down on paper or save them in any analog format you see fit.
The mnemonic phrase mechanism that you are likely familiar with was introduced with the Bitcoin Improvement Proposal 39, or BIP-0039.
A mnemonic phrase uses commonly known words, which are used to provide a less error-prone backup mechanism than writing down your private key in a format like:
- Base58
- Hexadecimal
- or Binary
In total, there are 2048 words a mnemonic phrase may comprise, each of them unambiguously identifiable by its first four letters.
Similar words or different versions of the same word are not included, for example:
- build/built
- woman/women
Generating a Mnemonic
- First, a multiple of 32 bits, between 128 and 256 bits total, is generated. Let’s use an example with a 256-bit seed.
- Second, a checksum is calculated by SHA256-hashing the seed. The first x bits of the checksum are appended to the end of the seed and x is the number of initial bits divided by 32. Using a 256-bit seed, this results in an 8 bit checksum.
- Third, the concatenated bits are split into groups of 11 bits each. We get 24 groups for our example.
- Next, each 11 bit group is “translated” into the corresponding word from the list of mnemonic words. 11 bits can encode the values between 0 and 2047, each matching a word in the list of 2048 words.
Now that we have a mnemonic phrase, it needs to be converted into a binary seed.
The PBKDF2 function takes the mnemonic phrase as the password and the string mnemonic + passphrase as the salt.
PBKDF2 takes this input and applies a pseudorandom function - here HMAC-SHA512 - to it, in this case 2048 times. The final output is the derived seed of 512 bits length.
This seed can now be used as an input for different key derivation methods, such as the hierarchical deterministic key derivation.
Key Derivation
Going forward, we'll use the term key generation to refer to the random generation of private keys and key derivation to refer to the path from a private key to the public key and an address.
There are instances when it is desirable to be able to generate different addresses from a single seed or private key. By using a new address for each incoming transaction, as well as to change outputs, your privacy is enhanced because it becomes harder to link transactions to a single user.
There are different approaches to get those addresses.
One could go through the entire process of key generation and derivation for each address, but this would require the user to keep backups for each individual key. This approach is called non-deterministic key generation.
While this approach mitigates the risk of losing all your funds at once in the case you lose a private key and/or mnemonic phrase, it certainly makes for a bad user experience.
Being able to securely produce several child keys from a single parent key creates a better UX in basic wallets, and also allows multi-currency wallets to be backed up with a single mnemonic.
This single mnemonic derives a key pair for each chain.
Hierarchical Deterministic Wallets
Hierarchical Deterministic Wallets were introduced with BIP-0032. The general idea is to take a master private key and use it to generate multiple secure child keys.
Before we dive into different key derivation methods, we need to introduce a concept we’ve omitted thus far.
The 512-bit seed we obtained from our mnemonic phrase by applying the PKKDF2 function is split into two parts:
- A zero-level private key
- and a 256-bit chain code cc
The chain code is used as a key to the HMAC-SHA256 pseudorandom function that takes two pieces of input, the data and a key, to produce a single 512-bit output.
The HMAC-SHA256 function is used at several steps in the key derivation process.
To recap what we did thus far:
- We generated a random bit sequence and used it as an input to a deterministic random bit generator that produced a seed.
- From this seed, we generated a mnemonic phrase, which in turn was converted into a 512-bit binary seed.
- We split our seed into the zero-level private key \(sk_0\) and the zero-level chain code \(cc_0\). A zero-level public key \(PK_0\) is generated from \(sk_0\) using elliptic curve cryptography
Hardened vs. Non-Hardened Secret Key Derivation
When looking at the key derivation in HD wallets, we can generally differentiate between hardened secret key derivation (HSKD) and non-hardened secret key derivation (NSKD).
The difference lies in the inputs used to generate first-level keys.
Additionally, we need to introduce the variable i at this point. It is a 32-bit integer that is used to derive different keys with index i from a single zero-level key.
By convention, the total range of possible values of i is split between HSKD and NSKD, giving both methods the option to create 2147483648 different first-level keys from a single zero-level key.
The hardened secret key derivation takes the zero-level private key \(sk_0\), the zero-level chain code \(cc_0\) and the integer i as an input.
By convention, the top of the integer range, from \(2^{31}\) to \(2^{32}-1\), of i, is used for the hardened secret key derivation method.
In other words, using the HSKD method, the first-level private key \(sk_1\) is a function of three values:
$$ sk_1 = HSKD(sk_0, c_0, i) $$
The non-hardened secret key derivation (NSKD) additionally takes the zero-level public key \(PK_0\) into account. In the non-hardened secret key derivation method, i will by convention take on values between 0 and \(2^{31}-1\), so the lower half of the total range of i.
In other words, using the HSKD method the obtained first-level private key \(sk_1\) is a function of four values:
$$ sk_1 = NSKD(sk_0, PK_0 c_0, i) $$
Hardened Secret Key Derivation - HSKD
The core function that the key derivation is build around is the hash-based message authentication code (HMAC). It is a specific type of message authentication code involving a cryptographic hash function and a secret cryptographic key.
It always uses some hash function, which is then appended to name the exact function. In this specific case, the SHA512 hash function is used and the message authentication code is called HMAC-SHA512.
HMAC consumes several inputs. For our purposes we group them into two: the key and the data.
For the HSKD method, the concatenation of \(sk_0\) and our variable integer i is used as the data and the zero-level chain code \(c_0\) as the key.
The resulting 512-bit output is split into two parts of 256 bits each.
- The first half becomes our first-level private key \(sk_1\)
- The second half our first-level chain code \(cc_1\).
By applying elliptic curve cryptography to \(sk_1\) we can obtain our first-level public key \(PK_1\).
By incrementing i we can generate \(2^{31}\) or 2147483648 different first level private keys and chain codes.
Non-Hardened Secret Key Derivation - NSKD
We can also derive a first level private key using a combination of our zero-level public key \(PK_0\), zero-level secret key \(sk_0\) and our zero-level chain code \(cc_0\).
This method is called Non-Hardened Secret Key Derivation (NSKD).
Again, we will use the HMAC-SHA256 function, but this time we use the concatenation of \(PK_0\) and i as out data and once again \(cc_0\) as our key.
Again, the resulting 512-bit output is split into two parts of 256 bits each. While the second part becomes our first level chain code \(cc_1\) as it did in the HSKD method, the first 256 bits are handled differently.
It doesn't yield our first level private key directly, but instead an addition modulo N with the initial private key \(sk_0\) is performed in order to obtain the first level private key.
Lastly, ECC is used to generate our first level NSKD public key \(PK_1\).
Now imagine a case where a large number of addresses, or public keys respectively, need to be generated on a continuous basis - think a merchant accepting crypto payments.
Both methods described before, HSKD and NSKD need access to the zero-level private key to compute new child keys and hence addresses.
The zero-level private key is also the key that will allow the merchant to spend the money they received, so they should never host this process on the same server that the payment logic resides on. Meet NPKD - the Non-Hardened Public Key Derivation method.
Non-Hardened Public Key Derivation - NPKD
Using the Non-Hardened Public Key Derivation method, our zero level private key \(sk_0\) isn't needed at all. The first steps in the NPKD method are analogous to the NSKD method.
The concatenation of \(PK_0\) and i is used as data and \(cc_0\) as our key. Again, the resulting 512-bit output is split into two parts of 256 bits.
The first 256 bits are used as an input to the ECC module, producing an intermediary result we call \(PRE_1\).
Applying the addition module N together with our zero-level public key \(PK_0\) we get a first level public key \(PK_1\) without having to touch \(sk_0\) at all.
Spending funds received at an address derived from a first-level public key yielded from this method works with \(sk_0\), as the first-level public key obtained using NPKD is equivalent to the first-level public key produced with the NSKD method.
Summary - Mnemonic Phrases
The mnemonic phrase is a list of words that can be used to restore your private key. The mnemonic phrase is as sensible as your private key itself and if it gets in the wrong hands, your money can be stolen.
This should not scare you, but make you cautious.
The first step in generating keys is to produce a random private key. Different sources for entropy generation can be used, and it’s advisable to use several sources of entropy for added security.
Mnemonic phrases are generated by splitting the initial entropy into groups and mapping each group to a word.
There are different ways to derive a public key from a private key, the difference being the type of inputs. Most wallets today are hierarchical deterministic wallets that can create several key pairs from an initial entropy source.
This is good for the user because they only need a single key to generate several separate accounts, even across different blockchains.