Seed Phrase Recovery Tool: Find a Used Seed Phrase (with Code!)

This is the third part of the Find My Phrase series on finding a seed phrase (with code!).

You have a bunch of seed phrases you want to check because you were missing a word. It'd be a pain to enter each one and go through the process of recovering each one (especially if you have hundreds to check).

What if I told you that you don't have to? There's a way to check each one automatically.

It'll be helpful to browse the earlier parts to understand how we ended up with multiple seed phrases to check. At a minimum, read Part 1: Find the Last Word in a Seed Phrase for instructions on installing Python, the programming language we're going to use to do this.

Now assuming you've read these first two parts or already have Python installed, we're going to get into how to check a seed phrase for usage.

Disclaimer: This is meant to be an educational exercise to utilize programming to explore automation. It is not recommended to do this with your own seed phrase without a secure machine. Entering your seed phrase on a device connected to the internet exposes your seed phrase to potential security threats. If you choose to do so, you fully understand the risks and are liable for the consequences.

Finding A Used Seed Phrase

We will be working with the same seed phrase as the previous parts, but with multiple options for the first word:

crawl entire sniff tired miracle solve shadow scatter hello never tank side sight isolate sister uniform advice pen praise soap lizard festival connect baby

element entire sniff tired miracle solve shadow scatter hello never tank side sight isolate sister uniform advice pen praise soap lizard festival connect baby

insane entire sniff tired miracle solve shadow scatter hello never tank side sight isolate sister uniform advice pen praise soap lizard festival connect baby

million entire sniff tired miracle solve shadow scatter hello never tank side sight isolate sister uniform advice pen praise soap lizard festival connect baby

rally entire sniff tired miracle solve shadow scatter hello never tank side sight isolate sister uniform advice pen praise soap lizard festival connect baby

release entire sniff tired miracle solve shadow scatter hello never tank side sight isolate sister uniform advice pen praise soap lizard festival connect baby

roof entire sniff tired miracle solve shadow scatter hello never tank side sight isolate sister uniform advice pen praise soap lizard festival connect baby

symptom entire sniff tired miracle solve shadow scatter hello never tank side sight isolate sister uniform advice pen praise soap lizard festival connect baby

timber entire sniff tired miracle solve shadow scatter hello never tank side sight isolate sister uniform advice pen praise soap lizard festival connect baby

weapon entire sniff tired miracle solve shadow scatter hello never tank side sight isolate sister uniform advice pen praise soap lizard festival connect baby

As you can see, the first word is different for each seed phrase.

Our approach is to first determine the corresponding addresses for each seed phrase. Then we will check if any of those addresses had some activity in the past (since all transactions are recorded on the blockchain).

But, we aren't going to re-invent the wheel. We're going to utilize some existing resources to help us.

Step 1: Installing Libraries

We're going to install something called a "library". A library is a collection of re-usable code that can contain utilities that do specific tasks. By using a library, we may not have to code it from scratch if it already has that capability we're looking for.

In our case, we're going to install two libraries: pycoin and requests.

The first one, pycoin, will let us find the corresponding addresses associated with each seed phrase.

The second one, requests, will let us get information on the activity in those addresses from an application that can read the blockchain.

To install a library in Python, we first have to open the terminal or command prompt:

For instructions to do this on macOS: https://support.apple.com/guide/terminal/open-or-quit-terminal-apd5265185d-f365-44cb-8b09-71a064a42125/mac

For instructions to do this on Windows: https://www.lifewire.com/how-to-open-command-prompt-2618089

Then type:

pip install pycoin requests

Once you're done typing, hit "enter" and wait for pip to finish installing the two packages.

Step 2: Create a PY File

Again, as we went over in Part 1, we're going to create another PY file to write our code.

Start with creating a blank text file. (For Mac open the TextEdit application and save it into your folder. For Windows, right-click -> New -> Text Document).

Then rename the file to have a .py extension at the end and press enter. I'm going to name my file, "phrase_check.py".

Step 3: Open the PY File

Again, double-click to open the .py file. When you open it, two windows will pop up. The left-hand window shows the results from our code. The right-hand window is where we're going to write our code.
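With Python open, a quick way to confirm that Step 1 worked is to ask Python whether it can find both libraries. The snippet below is only a small sketch using the built-in importlib library; the helper name missing_libraries is made up for this example.

```python
import importlib.util


def missing_libraries(names):
    """Return the names from `names` that Python cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# An empty list means both libraries are installed and ready to import.
print(missing_libraries(["pycoin", "requests"]))
```

If either name shows up in the printed list, go back to the terminal and run the pip install command again.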

Step 4: Importing Our Libraries

In order to use our libraries, we have to declare in our code that we're going to use them.

We're going to use the two libraries we just installed, pycoin and requests, and a built-in library (i.e. standard with the Python installation), hashlib.

Add this to the beginning of your code:

from pycoin.symbols.btc import network

import requests

import hashlib

As you can see, we do this by using the command import.

For the pycoin library, it's a bit different because we don't need all the utilities in the library. We only need one called "network", which lives under "symbols" and "btc" in pycoin, so we pull in just that one with the command "from". Documentation of these specifics can be found at: https://pycoin.readthedocs.io/en/latest/
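To make the difference between the two import styles concrete, here is a small sketch that uses only the built-in hashlib library, so it runs even without pycoin. The message being hashed is arbitrary and only there for illustration.

```python
import hashlib               # style 1: bring in the whole library
from hashlib import sha256   # style 2: bring in just one utility from it

message = b"hello"

# With style 1 we reach the utility through the library name...
digest_a = hashlib.sha256(message).hexdigest()

# ...with style 2 we can use the utility's name directly.
digest_b = sha256(message).hexdigest()

# Both styles end up calling the same utility, so this prints True.
print(digest_a == digest_b)
```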

Step 5: Creating Functions

When checking a seed phrase for usage, there will be a series of steps to get from the seed phrase to checking an associated address.

So rather than code everything at once, it'd be easier, cleaner, and more understandable to code each step separately.

These distinct steps can be coded separately into something called functions.

A function is a set of instructions that takes an input and can produce an output.

It'll be easier to understand this by laying out each of our steps.

Step 1: We need to go from a seed phrase (input) to its master key (output) via a set of instructions (function)

Step 2: The master key (input) will be used to generate the associated addresses (output)

Step 3: The associated addresses (input) will be checked one by one for any receiving activity and the one with transactions will be returned (output)

Step 4: Based on which address was returned (input), we'll determine which seed phrase it was generated from (output) 

Functions have a specific format in Python.

A function is first defined by def followed by the "function_name" of your choosing. Then, in the parentheses, come the "inputs", followed by a colon (:). You can have as many inputs as you want.

def function_name(input1, input2):

You can even make default values for your inputs by adding a "=" and setting it to something. This will make that input optional since if you don't specify one, it'll default to what you set it to.

def function_name(input1, input2=1):

Afterwards you begin writing your code for your set of instructions that you want to do to the inputs. For example:

output = input1 + input2

At the end, you can return an "output" based on your set of instructions you did to the input:

return output

It would look something like this in its entirety:

def function_name(input1, input2=1):
    output = input1 + input2
    return output

Essentially, what this does is add "input1" and "input2" together and save the result in "output". If no "input2" is given, it will simply add the default, "1". Then it will return the "output".

So you can call this function by its name. For example:

function_name(1, 2)

This would return the number "3" since input1=1 and input2=2.

I could also do this:

function_name(1)

The result would be returning the number "2" since I did not specify an "input2".
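Here is the same idea as a small script you can run to see the results for yourself. The name add_numbers is just an illustrative stand-in for function_name, and the last call shows one extra trick: passing an input by its name.

```python
def add_numbers(input1, input2=1):
    """Add two numbers; input2 falls back to 1 when it is left out."""
    output = input1 + input2
    return output


print(add_numbers(1, 2))          # both inputs given: prints 3
print(add_numbers(1))             # input2 defaults to 1: prints 2
print(add_numbers(1, input2=10))  # inputs can also be passed by name: prints 11
```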

Step 5: Seed Phrase to Master Key

Let's write our first function. We're going to call it calc_key.

def calc_key(seed_phrase, passphrase):

As you can see, there are two inputs, seed_phrase and passphrase. "seed_phrase" is self-explanatory. "passphrase" is an additional, optional input that we'll also need. For more information on passphrases, see "What is a BIP39 Passphrase?"

In order to get from a seed phrase to an address, the seed phrase has to go through a series of "transformations" via mathematical calculations to get there. If you're interested, we've written an article "How Does A Seed Phrase Recover All My Cryptocurrency?" to explain the specifics.

The first "transformation" produces what is called a "seed". That's right, the "seed phrase" and the "seed" are two different things.

A seed looks like this:

fc196be88b09c22aae644eb2580dab905ff6f7d191a438ce7abf48299b6dadb2e9a23398023d526dc98ea8f9cef1bf0a53546d837812116efb773e3ef05605c2

The seed phrase is sort of a human-friendly stand-in for a long series of letters and numbers like this, because you can imagine it'd be difficult to copy down.

So in order to get from the seed_phrase to a seed, we first need to put it through something called Password Based Key Derivation Function 2 (PBKDF2) along with the optional passphrase.

The PBKDF2 is in one of the libraries we imported earlier, hashlib.

    seed = hashlib.pbkdf2_hmac("sha512",
                                seed_phrase.encode("utf-8"),
                                salt=("mnemonic" + passphrase).encode("utf-8"),
                                iterations=2048, dklen=64)

That looks complicated, right? Right. And we still have to put our seed through another transformation to get the master_key.

...but luckily the library we imported earlier, pycoin, has the utility to do it for us.

We can call upon the utility by starting from what we imported earlier, network, and going down through keys to get to bip32_seed, which will calculate the master key from the seed for us.

    master_key = network.keys.bip32_seed(seed)

And finally we want to output that master key from the function we're creating.

    return master_key

So in its entirety, your function should look like the code below. Add this portion of code after your library imports:

def calc_key(seed_phrase, passphrase):
    # Stretch the seed phrase (plus optional passphrase) into a 64-byte seed
    seed = hashlib.pbkdf2_hmac("sha512",
                                seed_phrase.encode("utf-8"),
                                salt=("mnemonic" + passphrase).encode("utf-8"),
                                iterations=2048, dklen=64)

    # Turn the seed into the master key
    master_key = network.keys.bip32_seed(seed)

    return master_key
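If you're curious what these two transformations look like without pycoin, here is a rough sketch using only built-in libraries. It assumes the standard recipes: BIP39 for phrase to seed (the same PBKDF2 call as above) and BIP32 for seed to master key (an HMAC-SHA512 keyed with the text "Bitcoin seed"). The function names phrase_to_seed and seed_to_master_parts are made up for this example, and the sketch skips the validity checks and the key object that a real library provides, so treat it as an illustration rather than a replacement for calc_key.

```python
import hashlib
import hmac


def phrase_to_seed(seed_phrase, passphrase=""):
    """BIP39 step: stretch the phrase (plus optional passphrase) into a 64-byte seed."""
    return hashlib.pbkdf2_hmac(
        "sha512",
        seed_phrase.encode("utf-8"),
        ("mnemonic" + passphrase).encode("utf-8"),
        2048,
        dklen=64,
    )


def seed_to_master_parts(seed):
    """BIP32 step: HMAC-SHA512 the seed with the fixed key b"Bitcoin seed".

    The first 32 bytes are the master private key, the last 32 the chain code.
    """
    digest = hmac.new(b"Bitcoin seed", seed, hashlib.sha512).digest()
    return digest[:32], digest[32:]


phrase = ("crawl entire sniff tired miracle solve shadow scatter hello never "
          "tank side sight isolate sister uniform advice pen praise soap "
          "lizard festival connect baby")

seed = phrase_to_seed(phrase)
private_key, chain_code = seed_to_master_parts(seed)

print("seed length in bytes:", len(seed))
print("seed:", seed.hex())
print("master private key:", private_key.hex())
print("chain code:", chain_code.hex())

# A different passphrase leads to a completely different seed (and wallet).
print(phrase_to_seed(phrase, "my passphrase") != seed)
```

This is also why the passphrase matters so much: the same 24 words with a different passphrase produce a different seed, and therefore different addresses.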

Step 6: Master Key to Address

Now that we have the master key, we need to get to the corresponding addresses.

In reality, the master key goes through a series of calculations before getting to an address. In fact, this master key can generate a virtually infinite number of addresses.

So, how do we know which addresses we want to get?

That's where something called a "derivation path" comes in. It is essentially a "map" or route to find a specific address via the series of calculations.

The most common ones that a wallet will default to are these four:

  • 0/0
  • 44'/0'/0'/0/0
  • 49'/0'/0'/0/0
  • 84'/0'/0'/0/0

Each of the "/" you see is another calculation in the series. The numbers you see are additional inputs for each of the calculations.

So, we're going to check these four common derivation paths to get four different addresses.
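To see how a derivation path breaks down into its individual calculations, here is a small sketch that splits a path at each "/". The apostrophe after a number marks what BIP32 calls a "hardened" step. The helper name parse_path is made up for this example; pycoin does its own parsing internally.

```python
def parse_path(path):
    """Split a derivation path into (index, is_hardened) steps."""
    steps = []
    for part in path.split("/"):
        if part == "m":  # some tools write a leading "m" for the master key
            continue
        is_hardened = part.endswith("'")
        index = int(part.rstrip("'"))
        steps.append((index, is_hardened))
    return steps


for path in ("0/0", "44'/0'/0'/0/0", "49'/0'/0'/0/0", "84'/0'/0'/0/0"):
    print(path, "->", parse_path(path))
```

For example, "44'/0'/0'/0/0" is five calculations in a row: three hardened steps (44, 0, 0) followed by two normal ones (0, 0).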

Again, luckily, the pycoin library has this utility and can make it easy for us.

Add this new function, gen_address, to your code. It takes the derivation_path and master_key as inputs and outputs an address.

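def gen_address(derivation_path, master_key):

    # Follow the derivation path from the master key down to one specific key
    subkey = master_key.subkey_for_path(derivation_path)

    # Hash of the (compressed) public key, the raw ingredient of the address
    hash_160 = subkey.hash160(is_compressed=True)

    if derivation_path[:2] == "49":
        # 49' paths: SegWit wrapped in a script hash (addresses starting with "3")
        script = network.contract.for_p2pkh_wit(hash_160)
        address = network.address.for_p2s(script)
    elif derivation_path[:2] == "84":
        # 84' paths: native SegWit (addresses starting with "bc1")
        address = network.address.for_p2pkh_wit(hash_160)
    else:
        # Everything else: legacy address (starting with "1")
        address = subkey.address()

    return address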
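The if / elif / else in gen_address picks a different address format depending on how the derivation path starts. As a rough illustration of that branching (without needing pycoin), the sketch below returns a description of the format instead of a real address. The function name address_kind is made up for this example, and the prefixes listed are the usual ones for Bitcoin mainnet addresses.

```python
def address_kind(derivation_path):
    """Describe the address format that gen_address picks for a path."""
    purpose = derivation_path[:2]  # same two-character check as gen_address
    if purpose == "49":
        return "SegWit wrapped in a script hash (starts with 3)"
    if purpose == "84":
        return "native SegWit (starts with bc1)"
    return "legacy (starts with 1)"


for path in ("0/0", "44'/0'/0'/0/0", "49'/0'/0'/0/0", "84'/0'/0'/0/0"):
    print(path, "->", address_kind(path))
```

Note that the check only looks at the first two characters, so it assumes the path is written without a leading "m/".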

Step 7: Check Address Activity

Now that we're able to get the addresses, we need a way to check the blockchain to see if any coin was sent to these addresses at any point in time.

We're going to utilize Blockchain's explorer to do that.

You can actually get information on an address by putting this URL in your browser of choice:

https://blockchain.info/balance?active=address

where address is the address of choice. You can even look up multiple addresses at once by separating them with a "|", up to a limit of 100 addresses per request.

https://blockchain.info/balance?active=address1|address2|

For example:

https://blockchain.info/balance?active=16RCf8jAfz495wTq7umNS8mEv3uNofn6gX|3NiRFNztVLMZF21gx6eE1nL3Q57GMGuunG

Entering the URL in a browser and hitting enter will result in this data:

{"16RCf8jAfz495wTq7umNS8mEv3uNofn6gX":{"final_balance":0,"n_tx":6,"total_received":8586},
"3NiRFNztVLMZF21gx6eE1nL3Q57GMGuunG":{"final_balance":0,"n_tx":0,"total_received":0}}

For the 16RCf8jAfz495wTq7umNS8mEv3uNofn6gX address:

The final balance on this address is 0. But there were 6 transactions ("n_tx"), and the address received 8586 sats (0.00008586 BTC) in total. This means whichever seed phrase this address belongs to has been used at some point in time.
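To get a feel for reading this data in Python before we involve the internet, here is a small sketch that works on a copy of the response shown above, pasted in as text. It uses the built-in json library; the names response_text, balance_data, and used are made up for this example.

```python
import json

# A copy of the example response from above, pasted in as text.
response_text = """
{"16RCf8jAfz495wTq7umNS8mEv3uNofn6gX":{"final_balance":0,"n_tx":6,"total_received":8586},
"3NiRFNztVLMZF21gx6eE1nL3Q57GMGuunG":{"final_balance":0,"n_tx":0,"total_received":0}}
"""

SATS_PER_BTC = 100_000_000

# Turn the text into a Python dictionary: {address: {details}}
balance_data = json.loads(response_text)

# Keep only the addresses that have ever received anything.
used = [addr for addr, info in balance_data.items() if info["total_received"] > 0]

for addr in used:
    sats = balance_data[addr]["total_received"]
    print(addr, "received", sats, "sats =", f"{sats / SATS_PER_BTC:.8f}", "BTC")
```

Only the first address is reported, since the second one has a total_received of 0.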

So this is what we're going to do with our next function:

1. Input an address_list. Each entry in the list is itself a group of addresses (one group per seed phrase).
  def address_usage(address_list):

2. Add the addresses to the end of the Blockchain URL we mentioned above, separated by "|", and save it to address_url.
      address_url = "https://blockchain.info/balance?active="+"|".join(map("|".join, address_list))
3. Request the data for the addresses by using the requests library and its utility get. Then save that data in the variable address_data.
      address_data = requests.get(address_url)
4. Use a "for" loop, which runs the block of code written below it once for each item, to look through all of the addresses in address_data. For each one, check if the total_received is greater than 0 ( > 0 ). If it is, save the address (key), print it along with the final_balance, total_received, and number of transactions (n_tx), and stop looking (break). Otherwise, leave address empty ( "" ).
      for key,value in address_data.json().items():
          if value['total_received'] > 0:
              address = key
              print("Address: "+ key)
              print("Final Balance: " + str(value["final_balance"]))
              print("Total Recieved: " + str(value["total_received"]))
              print("Number of Tx: " + str(value["n_tx"]))
              break
          else:
              address = ""
5. Output that address
      return address

The function in its entirety should look like the code below. Add this to your code:

def address_usage(address_list):

    address_url = "https://blockchain.info/balance?active="+"|".join(map("|".join, address_list))

    address_data = requests.get(address_url)

    for key,value in address_data.json().items():
        if value['total_received'] > 0:
            address = key
            print("Address: "+ key)
            print("Final Balance: " + str(value["final_balance"]))
            print("Total Recieved: " + str(value["total_received"]))
            print("Number of Tx: " + str(value["n_tx"]))
            break
        else:
            address = ""

    return address
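The first line of address_usage packs a lot into one expression, so here is a sketch of the same URL-building step spelled out more slowly. The addresses are placeholders and the helper name build_balance_url is made up for this example; nothing is sent over the internet.

```python
def build_balance_url(address_groups):
    """Flatten groups of addresses into one "|"-separated balance URL."""
    flat = [address for group in address_groups for address in group]
    return "https://blockchain.info/balance?active=" + "|".join(flat)


# Two seed phrases with two placeholder addresses each.
groups = [["addrA1", "addrA2"], ["addrB1", "addrB2"]]

url = build_balance_url(groups)
print(url)

# The one-liner used in address_usage produces the same ending, so this prints True.
print(url.endswith("|".join(map("|".join, groups))))
```

The inner "|".join glues together the addresses of one seed phrase, and the outer "|".join glues those groups together into one long list for the URL.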
    

Step 8: Checking Multiple Seed Phrases

Now that we're able to check address data for one seed phrase, we want to be able to do this for hundreds, thousands, etc. of seed phrases.

We're going to create another function that utilizes the three (3) functions we just made to check multiple seed phrases.

This is how we're going to do it:

1. Input a seed_phrase_list, an optional passphrase, and the derivation paths to check.
  • Our optional passphrase will default to empty (i.e. "")
  • Our derivation_path will default to the four we discussed earlier.
    def phrase_usage(seed_phrase_list,
                            passphrase = "",
                            derivation_path = ("0/0",
                                               "44'/0'/0'/0/0",
                                               "49'/0'/0'/0/0",
                                               "84'/0'/0'/0/0")):
2. Since the most addresses we can check at a time is one hundred (100), we want to split our list of seed phrases into chunks, then generate and search the addresses for one chunk at a time so that no request goes above 100 addresses.
  • For example, since we're checking 4 addresses per seed phrase, we're only able to check 25 seed phrases at a time (25 * 4 = 100).
  • We're going to set our max_address_limit to 100.
        max_address_limit = 100
  • In order to find the seed_phrase_limit that we can check at a time based on the max_address_limit, we need to divide the "max_address_limit" by the number of addresses we're going to get from each seed phrase. In this case, that is the number or length (len) of derivation paths (derivation_path) we have. The "//" divides and rounds down to a whole number.
    seed_phrase_limit = max_address_limit//len(derivation_path)

3. Check the seed_phrase_list in chunks based on the seed_phrase_limit using a for loop. This "for" loop steps through the seed_phrase_list, seed_phrase_limit phrases at a time, until the entire list is exhausted. These are the chunks.
      for i in range(0, len(seed_phrase_list), seed_phrase_limit):
4. Input the chunk of seed phrases using our calc_key function that will calculate their respective master key and save the output into master_keys.
          master_keys = [calc_key(seed_phrase, passphrase) for seed_phrase in seed_phrase_list[i:i+seed_phrase_limit]]
5. Input those master_keys using our gen_address function that will generate the addresses based on our derivation paths and save the output into addresses.
          addresses = [[gen_address(path, key) for path in derivation_path] for key in master_keys]
6. Input those addresses using our address_usage function which will check if there is any transaction activity for those addresses and output the address that had activity. Then, save it into address_found.
          address_found = address_usage(addresses)
7. If the address_found is not ( != ) empty ( "" ), then look for which number seed phrase (index_match) in the chunk the "address_found" corresponds to and print the corresponding seed phrase. We can also stop checking the rest of the seed phrases (break).
          if address_found != "":
              index_match = [j for j, group in enumerate(addresses) if address_found in group][0]
              print("Seed Phrase: " + seed_phrase_list[i:i+seed_phrase_limit][index_match])
              break
8. If address_found continues to be empty, it will continue the "for" loop until the seed_phrase_list is exhausted.
9. Once the for loop is complete, we want to know that the function is finished. So we'll print complete.
      print("COMPLETE")

The code in its entirety is below:

def phrase_usage(seed_phrase_list,
                 passphrase = "",
                 derivation_path = ("0/0",
                                    "44'/0'/0'/0/0",
                                    "49'/0'/0'/0/0",
                                    "84'/0'/0'/0/0")):
    max_address_limit = 100

    seed_phrase_limit = max_address_limit//len(derivation_path)

    for i in range(0, len(seed_phrase_list), seed_phrase_limit):

        master_keys = [calc_key(seed_phrase, passphrase) for seed_phrase in seed_phrase_list[i:i+seed_phrase_limit]]

        addresses = [[gen_address(path, key) for path in derivation_path] for key in master_keys]

        address_found = address_usage(addresses)

        if address_found != "":
            index_match = [j for j, group in enumerate(addresses) if address_found in group][0]
            print("Seed Phrase: " + seed_phrase_list[i:i+seed_phrase_limit][index_match])
            break

    print("COMPLETE")
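To see the chunking on its own, here is a small sketch that splits a made-up list of 60 placeholder phrases the same way phrase_usage does. The helper name split_into_chunks and the placeholder phrases are assumptions for this example; no addresses are generated or looked up.

```python
def split_into_chunks(seed_phrase_list, paths_per_phrase=4, max_address_limit=100):
    """Split the list so each chunk yields at most max_address_limit addresses."""
    seed_phrase_limit = max_address_limit // paths_per_phrase
    return [seed_phrase_list[i:i + seed_phrase_limit]
            for i in range(0, len(seed_phrase_list), seed_phrase_limit)]


# 60 placeholder phrases, 4 derivation paths each.
phrases = [f"phrase {n}" for n in range(60)]

chunk_sizes = [len(chunk) for chunk in split_into_chunks(phrases)]
print(chunk_sizes)  # prints [25, 25, 10]
```

With 4 derivation paths, the 60 phrases are checked in three requests of 100, 100, and 40 addresses.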
    

Putting It All Together

We've got our seed phrase checker!

Here is the entirety of the code we discussed above:

from pycoin.symbols.btc import network

import requests

import hashlib

def calc_key(seed_phrase, passphrase):
    # Seed phrase (plus optional passphrase) -> 64-byte seed
    seed = hashlib.pbkdf2_hmac("sha512",
                               seed_phrase.encode("utf-8"),
                               salt=("mnemonic" + passphrase).encode("utf-8"),
                               iterations=2048,
                               dklen=64)

    # Seed -> master key
    master_key = network.keys.bip32_seed(seed)

    return master_key

def gen_address(derivation_path, master_key):

    subkey = master_key.subkey_for_path(derivation_path)

    hash_160 = subkey.hash160(is_compressed=True)

    if derivation_path[:2] == "49":
        script = network.contract.for_p2pkh_wit(hash_160)
        address = network.address.for_p2s(script)
    elif derivation_path[:2] == "84":
        address = network.address.for_p2pkh_wit(hash_160)
    else:
        address = subkey.address()

    return address

def address_usage(address_list):

    address_url = "https://blockchain.info/balance?active="+"|".join(map("|".join, address_list))

    address_data = requests.get(address_url)

    for key,value in address_data.json().items():
        if value['total_received'] > 0:
            # Remember which address had activity so we can return it
            address = key
            print("Address: "+ key)
            print("Final Balance: " + str(value["final_balance"]))
            print("Total Recieved: " + str(value["total_received"]))
            print("Number of Tx: " + str(value["n_tx"]))
            break
        else:
            address = ""

    return address

def phrase_usage(seed_phrase_list,
                 passphrase = "",
                 derivation_path = ("0/0",
                                    "44'/0'/0'/0/0",
                                    "49'/0'/0'/0/0",
                                    "84'/0'/0'/0/0")):
    max_address_limit = 100

    seed_phrase_limit = max_address_limit//len(derivation_path)

    for i in range(0, len(seed_phrase_list), seed_phrase_limit):

        master_keys = [calc_key(seed_phrase, passphrase) for seed_phrase in seed_phrase_list[i:i+seed_phrase_limit]]

        addresses = [[gen_address(path, key) for path in derivation_path] for key in master_keys]

        address_found = address_usage(addresses)

        if address_found != "":
            index_match = [j for j, group in enumerate(addresses) if address_found in group][0]
            print("Seed Phrase: " + seed_phrase_list[i:i+seed_phrase_limit][index_match])
            break

    print("COMPLETE")
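The least obvious line in phrase_usage is the one that works backwards from the address that was found to the seed phrase it came from. Here is a sketch of that lookup on its own with placeholder data. It uses zip to walk the phrases and their address groups side by side, which is a slightly different (but equivalent) route than the index_match line above; the name phrase_for_address is made up for this example.

```python
def phrase_for_address(seed_phrases, address_groups, address_found):
    """Return the seed phrase whose address group contains address_found."""
    for phrase, group in zip(seed_phrases, address_groups):
        if address_found in group:
            return phrase
    return None  # no group contains that address


# Three placeholder phrases with two placeholder addresses each.
phrases = ["phrase A", "phrase B", "phrase C"]
groups = [["a1", "a2"], ["b1", "b2"], ["c1", "c2"]]

print(phrase_for_address(phrases, groups, "b2"))  # prints phrase B
print(phrase_for_address(phrases, groups, "zz"))  # prints None
```

This works because the address groups are built in the same order as the seed phrases, so the group at position 1 always belongs to the phrase at position 1.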

Step 9: Testing it Out

Let's test out our "phrase_usage" function. We're going to check the ten (10) seed phrases we had earlier.

First, let's define our list of seed phrases ("seed_phrase_possible") and input it into our "phrase_usage" function. Add this to your code:

seed_phrase_possible = ("crawl entire sniff tired miracle solve shadow scatter hello never tank side sight isolate sister uniform advice pen praise soap lizard festival connect baby",
"element entire sniff tired miracle solve shadow scatter hello never tank side sight isolate sister uniform advice pen praise soap lizard festival connect baby",
"insane entire sniff tired miracle solve shadow scatter hello never tank side sight isolate sister uniform advice pen praise soap lizard festival connect baby",
"million entire sniff tired miracle solve shadow scatter hello never tank side sight isolate sister uniform advice pen praise soap lizard festival connect baby",
"rally entire sniff tired miracle solve shadow scatter hello never tank side sight isolate sister uniform advice pen praise soap lizard festival connect baby",
"release entire sniff tired miracle solve shadow scatter hello never tank side sight isolate sister uniform advice pen praise soap lizard festival connect baby",
"roof entire sniff tired miracle solve shadow scatter hello never tank side sight isolate sister uniform advice pen praise soap lizard festival connect baby",
"symptom entire sniff tired miracle solve shadow scatter hello never tank side sight isolate sister uniform advice pen praise soap lizard festival connect baby",
"timber entire sniff tired miracle solve shadow scatter hello never tank side sight isolate sister uniform advice pen praise soap lizard festival connect baby",
"weapon entire sniff tired miracle solve shadow scatter hello never tank side sight isolate sister uniform advice pen praise soap lizard festival connect baby")

phrase_usage(seed_phrase_possible)
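Since the ten phrases only differ in the first word, you don't have to type them all out by hand. Here is a sketch of building the same seed_phrase_possible from the list of first words; the names first_words and remaining_words are made up for this example.

```python
# The ten candidate first words, followed by the 23 words they all share.
first_words = ("crawl", "element", "insane", "million", "rally",
               "release", "roof", "symptom", "timber", "weapon")

remaining_words = ("entire sniff tired miracle solve shadow scatter hello never "
                   "tank side sight isolate sister uniform advice pen praise "
                   "soap lizard festival connect baby")

# Glue each first word onto the shared remaining words.
seed_phrase_possible = tuple(f"{word} {remaining_words}" for word in first_words)

print(len(seed_phrase_possible))  # prints 10
print(seed_phrase_possible[1])
```

This becomes much more useful once the list of candidates grows from ten to hundreds.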
    

Save the code and run it.

You should get this result:

Address: 16RCf8jAfz495wTq7umNS8mEv3uNofn6gX
Final Balance: 0
Total Recieved: 8586
Number of Tx: 6
Seed Phrase: element entire sniff tired miracle solve shadow scatter hello never tank side sight isolate sister uniform advice pen praise soap lizard festival connect baby
COMPLETE

We found the address with activity and it corresponds to the seed phrase with the first word "element".

You can check this for yourself by putting that seed phrase into https://iancoleman.io/bip39/ and seeing that the address matches the first address under the path "m/44'/0'/0'/0/0".

You can also look up that address in the blockchain explorer: https://www.blockchain.com/btc/address/16RCf8jAfz495wTq7umNS8mEv3uNofn6gX

The Next Step

Now that we're able to check hundreds, thousands, etc. of seed phrases, it's about time we tackle finding a seed phrase with more than one missing word.

We'll discuss that in the next part of our series: Find a Seed Phrase with Multiple Missing Words.
