Our friend’s voice was shaking over the phone: “I think I lost all of my bitcoins.” Knowing he had been holding a lot of bitcoins for a long time (“HODLing”), we knew the loss was significant (a six-figure sum).
He explained that he used a hardware wallet stored in a safe and had forgotten its PIN code. He therefore bought a new hardware wallet and tried to recover his funds using his seed phrase (a list of 24 words, also known as a “seed mnemonic”, “backup phrase” or “mnemonic phrase”). But this did not work because the 10th word was wrong. It looked like “stab”, but the device insisted it was wrong. Our friend’s thumbs were exhausted from typing all kinds of variants of this word, but to no avail.
“Don’t worry,” we replied, “we’re here for you. We will do our best to recover your phrase and money.”
Before describing how we recovered the seed phrase, we need to understand how a seed phrase works.
The seed phrase encodes a binary string as words. The binary string consists of a random value (also known as “entropy”) and a checksum. The checksum is computed from the random value (in BIP-39, it is the first few bits of the value’s SHA-256 hash) and appended to the string, which makes it easy to detect errors in the string, much like the final digit of a credit card number. The result is a list of words that encodes both the random value and the checksum.
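To make the encoding concrete, here is a minimal Python sketch of how BIP-39 turns entropy into word positions. It works with 11-bit indices into the 2048-word list rather than with the words themselves, so the word list lookup is left out. The function names `entropy_to_indices` and `indices_are_valid` are our own illustrative choices, not part of any library.

```python
import hashlib
from typing import List


def entropy_to_indices(entropy: bytes) -> List[int]:
    """Append the checksum to the entropy and split into 11-bit word indices."""
    ent_bits = len(entropy) * 8              # 128 bits -> 12 words, 256 bits -> 24 words
    cs_bits = ent_bits // 32                 # BIP-39: one checksum bit per 32 entropy bits
    digest = hashlib.sha256(entropy).digest()
    checksum = digest[0] >> (8 - cs_bits)    # leading cs_bits bits of the SHA-256 hash
    combined = (int.from_bytes(entropy, "big") << cs_bits) | checksum
    n_words = (ent_bits + cs_bits) // 11
    # Read the combined bit string 11 bits at a time, most significant group first.
    return [(combined >> (11 * (n_words - 1 - i))) & 0x7FF for i in range(n_words)]


def indices_are_valid(indices: List[int]) -> bool:
    """Recompute the checksum from the entropy part and compare it to the stored one."""
    total_bits = len(indices) * 11
    cs_bits = total_bits // 33
    combined = 0
    for idx in indices:
        combined = (combined << 11) | idx
    entropy = (combined >> cs_bits).to_bytes((total_bits - cs_bits) // 8, "big")
    expected = hashlib.sha256(entropy).digest()[0] >> (8 - cs_bits)
    return (combined & ((1 << cs_bits) - 1)) == expected


# 16 zero bytes give eleven times index 0 ("abandon") followed by index 3 ("about").
print(entropy_to_indices(bytes(16)))
```

A 24-word phrase therefore carries 256 bits of entropy and only 8 bits of checksum, which is enough to catch most typos but not to pinpoint the correct word on its own.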
This encoding was introduced because it is easier (but not easy, as we saw) to write down a few words than a long string of alphanumeric characters. The BIP-39 Bitcoin standard describes exactly how to encode this value into words in several languages.
The BIP-32 standard describes the algorithms and processes for securely deriving multiple private-public key pairs and addresses from a single entropy value, thus creating hierarchical deterministic (HD) wallets. Multiple addresses are needed because different blockchains require different addresses, and because using fresh addresses within the same blockchain protects the privacy of users.
There are a few standards that implement the concepts of BIP-32 to create specific derivations. For example, BIP-44 is used to derive Ethereum and regular Bitcoin addresses, BIP-49 is used for Bitcoin Segwit compatibility addresses and BIP-84 is used for native Bitcoin Segwit addresses.
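As an illustration of how these standards differ, the sketch below builds the derivation path that each one prescribes. The helper `derivation_path` and the two lookup tables are hypothetical names of ours; the path layout (purpose, coin type, account, change, index) and the coin types 0 for Bitcoin and 60 for Ethereum follow BIP-44 and SLIP-44.

```python
# Purpose field used by each standard (the first hardened level of the path).
PURPOSES = {"BIP-44": 44, "BIP-49": 49, "BIP-84": 84}

# SLIP-44 coin types for the two blockchains mentioned in the text.
COIN_TYPES = {"bitcoin": 0, "ethereum": 60}


def derivation_path(standard, coin="bitcoin", account=0, change=0, index=0):
    """Build m / purpose' / coin_type' / account' / change / address_index."""
    purpose = PURPOSES[standard]
    coin_type = COIN_TYPES[coin]
    return f"m/{purpose}'/{coin_type}'/{account}'/{change}/{index}"


print(derivation_path("BIP-44"))              # legacy Bitcoin address, starts with "1"
print(derivation_path("BIP-49"))              # Segwit compatibility address, starts with "3"
print(derivation_path("BIP-84"))              # native Segwit address, starts with "bc1q"
print(derivation_path("BIP-44", "ethereum"))  # Ethereum address
```

The same seed phrase yields a completely different first address under each path, so a recovery tool that checks only one standard can miss funds that are actually there.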
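Now that we understand the technical details of seed phrases, we can gain the following insights for the recovery of the misspelled seed phrase: the correct word must be one of the 2048 words in the BIP-39 word list, the checksum lets us discard most of the wrong candidates, and the addresses derived from a candidate phrase can be checked against the blockchain for past transactions.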
We used the insights above to create a tool to find our friend’s mystery seed word and recover his funds. Our tool replaces the misspelled word with each of the BIP-39 words in turn and checks whether the checksum is valid. If the resulting seed phrase is valid, the tool derives all of the possible Bitcoin addresses and automatically checks their transaction records using a block explorer service.
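The checksum step of this search can be sketched in a few lines of Python. This is not the tool’s actual code: the names `checksum_ok` and `candidates_for_unknown` are hypothetical, the phrase is represented as 11-bit word indices instead of words (so no word list is needed), and the address derivation and block explorer lookup are left out.

```python
import hashlib
from typing import List, Optional

WORDLIST_SIZE = 2048  # every BIP-39 word list has exactly 2048 entries


def checksum_ok(indices: List[int]) -> bool:
    """Return True if the 11-bit word indices carry a valid BIP-39 checksum."""
    total_bits = len(indices) * 11
    cs_bits = total_bits // 33               # 4 bits for 12 words, 8 bits for 24 words
    combined = 0
    for idx in indices:
        combined = (combined << 11) | idx
    entropy = (combined >> cs_bits).to_bytes((total_bits - cs_bits) // 8, "big")
    expected = hashlib.sha256(entropy).digest()[0] >> (8 - cs_bits)
    return (combined & ((1 << cs_bits) - 1)) == expected


def candidates_for_unknown(indices: List[Optional[int]]) -> List[int]:
    """Try every word index at the single None position; keep those that validate."""
    pos = indices.index(None)
    survivors = []
    for candidate in range(WORDLIST_SIZE):
        trial = list(indices)
        trial[pos] = candidate
        if checksum_ok(trial):
            survivors.append(candidate)
    return survivors


# Example: a 12-word phrase of eleven times index 0 plus an unknown last word.
print(len(candidates_for_unknown([0] * 11 + [None])))
```

For a 24-word phrase the checksum is 8 bits long, so on average roughly 1 in 256 candidates (about 8 of the 2048) passes it by chance. That is why the checksum alone is not enough, and why the surviving phrases still have to be told apart by looking for transaction history on their derived addresses.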
Using the tool, we were able to recover the correct seed phrase within a matter of seconds! By the way, the word was “slab”.
While that sounds pretty straightforward, we had all kinds of setbacks along the way. For example, because we misread other parts of our friend’s handwriting, we made mistakes when copying the seed words into our program. Additionally, we initially supported only the traditional encoding of Bitcoin addresses (BIP-44), and were very disappointed to find that none of the resulting addresses were correct. It took us some time to realize it could be a Segwit address (BIP-49). Only when we added that option to our tool did we finally come up with the right address and keys.
We felt quite good being able to help our friend and restore his funds. But then we thought of all the other users of cryptocurrency that may encounter a similar problem and have no programmer friends that can help.
So we decided to open-source the Seed Savior tool for the community. The tool is based on Ian Coleman’s work, which already contained the algorithms and code needed to derive addresses from a mnemonic phrase. We added the option to specify an unknown word by entering a question mark (“?”) in place of the word. The tool outputs each matching word and its derived addresses under most of the supported Bitcoin and Ethereum derivation standards.
To maximize user security, we recommend following Ian Coleman’s advice: save the HTML page of the tool and run it while offline. To support this, we removed the automatic check against online block explorers, and each address is now only a link to the explorer.
There is a growing understanding that the seed phrase solution cannot be part of the future of cryptocurrency for consumers. The seed phrase creates a bad onboarding experience, since the user needs to spend several minutes writing down and verifying the 12 or 24 words, and it is certainly problematic for recovery. We at KZen see seed phrases as a major problem for mass adoption of cryptocurrency and are building a solution to dramatically simplify this process. But until seed phrases become obsolete, we hope this tool will help to ease some of the pain associated with them.