Find Any Word in a Seed Phrase (with Code!)

If you're reading this first, go to Part 1: Find the Last Word in a Seed Phrase. It's essential to understand why the last word is important in determining any word of your seed phrase.

Now, assuming you have read the first part, we're going to modify our code to account for a missing word in any position.

Finding Any Word in a Seed Phrase

We will be working with the same seed phrase as last time, except the first word is missing:

? entire sniff tired miracle solve shadow scatter hello never tank side sight isolate sister uniform advice pen praise soap lizard festival connect baby

Step 1: Starting New

We're going to start fresh. Delete any code you've written so far so we begin with a blank slate. Your window should look like this:

[Image: python ide]

Step 2: Storing our Seed Phrase

We're going to save our seed phrase in variable seed_phrase and make it a list type variable. This is exactly like our previous code except the first word is a ? instead of the last.

Add this first part to your code:

#seed phrase separated by spaces
seed_phrase = "? entire sniff tired miracle solve shadow scatter hello never tank side sight isolate sister uniform advice pen praise soap lizard festival connect baby"

#converts seed phrase into a list to be able to interface with each word individually.
seed_phrase = seed_phrase.split(" ")
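One thing to watch for: split(" ") splits on every single space, so a phrase pasted with a double space or a trailing space produces empty strings in the list. Here is a small sketch of the difference, using a made-up three-word fragment (the names pasted, strict and lenient are ours for illustration, not part of the tutorial's code):

```python
# Hypothetical fragment pasted with an accidental double space and a trailing space.
pasted = "? entire  sniff "

# split(" ") treats every single space as a separator, so empty strings appear.
strict = pasted.split(" ")

# split() with no argument splits on runs of whitespace and drops the empties.
lenient = pasted.split()

print(strict)   # ['?', 'entire', '', 'sniff', '']
print(lenient)  # ['?', 'entire', 'sniff']
```

If your phrase is typed exactly as shown in this step, both calls give the same list, so this only matters for messy input.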

Step 3: Importing Our Word List

Again we're going to load our BIP39 word list (the "english.txt" file in the same folder) and store it as a list in the variable word_list. This portion is exactly the same as our previous code.

Add this to your code:

#opens the "english.txt" file and stores it into variable "english"
english = open("english.txt")

#reads the "english.txt" file stored in variable "english" and stores the words in the variable "word_list". Also, changes the variable type to a list.
word_list = english.read().split("\n")

#closes the "english.txt" file stored in variable "english" since we don't need it anymore.
english.close()
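As an optional alternative, the same loading step can be written with a "with" block, which closes the file for us. The sketch below is only a suggestion: read_word_list and the sample text are hypothetical names and data, and it assumes the file holds one word per line. It also shows a small difference between split("\n") and splitlines() when the text ends with a newline.

```python
def read_word_list(path):
    """Read a word list file and return one word per list entry."""
    # The "with" block closes the file automatically when it finishes.
    with open(path, encoding="utf-8") as english:
        return english.read().splitlines()

# A three-word stand-in for the file contents, ending with a newline.
sample_text = "abandon\nability\nable\n"

with_split = sample_text.split("\n")      # leaves an empty string at the end
with_splitlines = sample_text.splitlines()  # does not

print(with_split)       # ['abandon', 'ability', 'able', '']
print(with_splitlines)  # ['abandon', 'ability', 'able']
```

To use it in place of the three lines above, you would write word_list = read_word_list("english.txt"). The extra empty string from split("\n") does not change the index of any real word, so the tutorial's version still works.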

Step 4: Seed Phrase to Indexed Numbers and Binary

Again, this portion is exactly the same as before. We're taking each word of our seed phrase and converting it to its index number in the BIP39 word list. Then we're going to convert those numbers into binary.

As a reminder, each word in the BIP39 wordlist corresponds to an index number (0 for abandon through 2047 for zoo). Each of those numbers can be written as an 11 bit binary number (00000000000 for 0, 11111111111 for 2047).
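To make the index-to-binary idea concrete, here is a short sketch using the same format() call our code relies on. The index 136 is the one this article gives for "baby"; the variable names are just for illustration.

```python
# Index -> 11 bit string, using the same format() spec as the tutorial's code.
bits_by_index = {index: format(index, "011b") for index in (0, 136, 2047)}

for index, bits in bits_by_index.items():
    print(index, bits)
# 0 00000000000
# 136 00010001000
# 2047 11111111111

# And back again: int(..., 2) reads a bit string as a base-2 number.
round_trip = int("00010001000", 2)
print(round_trip)  # 136
```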

Add this to your code:

#converts seed_phrase (with words) to indexed number in BIP39 wordlist
seed_phrase_index = [word_list.index(word) if word != "?" else word for word in seed_phrase]

#converts seed_phrase_index (with numbers) to binary
seed_phrase_binary = [format(number, "011b") if number != "?" else number for number in seed_phrase_index]

print(seed_phrase_binary)

This will be the result if you run your code thus far:

['?', '01001011100', '11001101011', '11100010101', '10001101011', '11001110111', '11000100110', '11000000011', '01101010110', '10010101000', '11011101101', '11000111111', '11001000001', '01110110100', '11001001100', '11101101011', '00000100000', '10100010011', '10101001100', '11001101101', '10000010110', '01010101001', '00101111001', '00010001000']

Step 5: Calculating the Missing Bits

So once we have our binary seed phrase, we need to figure out how many of the last word's bits belong to the entropy rather than the checksum. Again, this piece of code is exactly the same.

Add this part:

#calculates the number of bits missing for entropy
num_missing_bits = int(11-(1/3)*(len(seed_phrase)))
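If you're curious where that formula comes from: in BIP39 the checksum is one bit for every three words, so the rest of the last word's 11 bits are entropy. Below is a small sketch of the same calculation using whole-number division instead of 1/3. The function name entropy_bits_in_last_word is ours, not part of the tutorial's code.

```python
def entropy_bits_in_last_word(num_words):
    # Three words are 33 bits: 32 bits of entropy plus 1 checksum bit,
    # so the checksum is num_words // 3 bits long.
    checksum_length = num_words // 3
    # Whatever is left of the last word's 11 bits belongs to the entropy.
    return 11 - checksum_length

for num_words in (12, 15, 18, 21, 24):
    print(num_words, entropy_bits_in_last_word(num_words))
# 12 7
# 15 6
# 18 5
# 21 4
# 24 3
```

For our 24 word seed phrase this gives 3, the same value as num_missing_bits.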

Step 6: Approach to Finding Any Word (NOTE: THIS IS WHERE IT IS DIFFERENT FROM THE PREVIOUS ENTRY)

Up to this point, our code has matched the previous entry. This is where it starts to differ.

We're trying to be able to find any word of a seed phrase. Thus, it could be the first, second, third, etc. or it could be the last word.

From our previous entry, we learned that the last word is calculated based on the previous words.

So if the last word was missing, we could simply calculate what it could be based on the previous (23) words.

If a different word was missing, we'd have to try all 2048 words in the BIP39 wordlist in that position, calculate the last word each time and see if it matches.

For example in our seed phrase, the first word is missing. We'd have to try all 2048 words starting with abandon and calculate the last word.

We'd then have to check if the last word matches the actual last word baby. If it's not a match, we'd try again with the next word.

So that's essentially what we're going to do but automate it with code.

In summary, our approach is:

  • If (the last word is missing)
    • Calculate the last word based on the previous 23 words
  • Else (if the last word is NOT missing)
    • Try each of the 2048 words in the missing position
    • Check if the calculated last word matches the actual last word
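The approach above can be sketched as a single function before we build it step by step. This is only an illustrative outline, not the code we'll write in the following steps: checksum_of, find_missing_word and demo_list are hypothetical names, and demo_list is a stand-in word list ("w0" to "w2047") so the sketch runs without "english.txt".

```python
import hashlib

def checksum_of(entropy_bits):
    """Return the checksum bits for a string of entropy bits."""
    entropy_bytes = int(entropy_bits, 2).to_bytes(len(entropy_bits) // 8, "big")
    first_byte_bits = format(hashlib.sha256(entropy_bytes).digest()[0], "08b")
    # One checksum bit per 32 bits of entropy (at most 8 for 24 words).
    return first_byte_bits[: len(entropy_bits) // 32]

def find_missing_word(words, word_list):
    """Return every word that gives a valid checksum in the "?" position."""
    position = words.index("?")
    checksum_length = len(words) // 3
    bits = [format(word_list.index(w), "011b") if w != "?" else None for w in words]
    results = []

    if position == len(words) - 1:
        # The last word is missing: calculate it from the previous words.
        missing = 11 - checksum_length
        prefix = "".join(bits[:-1])
        for x in range(2 ** missing):
            entropy = prefix + format(x, f"0{missing}b")
            last_word_bits = entropy[-missing:] + checksum_of(entropy)
            results.append(word_list[int(last_word_bits, 2)])
        return results

    # Another word is missing: try each of the 2048 words in that position
    # and keep the ones whose calculated checksum matches the actual one.
    for index in range(2048):
        bits[position] = format(index, "011b")
        full = "".join(bits)
        if checksum_of(full[:-checksum_length]) == full[-checksum_length:]:
            results.append(word_list[index])
    return results

# Stand-in word list so the sketch runs on its own (NOT the real BIP39 words).
demo_list = [f"w{i}" for i in range(2048)]
first_missing = find_missing_word(["?"] + ["w0"] * 10 + ["w3"], demo_list)
last_missing = find_missing_word(["w0"] * 11 + ["?"], demo_list)
print(len(first_missing), len(last_missing))
```

The demo uses a 12 word phrase to keep it short. With the real word list you would pass in word_list and your own seed phrase list instead.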

Step 6: Calculating the Possible Missing Bits for a Word

The missing word can be any of the 2048 words in the BIP39 wordlist. Thus, it can be any combination of 11 bits.

The code below generates every possible 11 bit combination, 00000000000 to 11111111111, and stores them in possible_word_bits.

Add this to your code below.

#calculates all the possible bits for a missing word
possible_word_bits = [bin(x)[2:].rjust(11, "0") for x in range(2**11)]
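If that one-liner is hard to picture, here is the same idea with only 3 bits so the whole list fits on screen. The helper all_bit_strings is a hypothetical name, and it uses format() rather than bin() and rjust(), which should give the same strings.

```python
def all_bit_strings(width):
    # format(x, "0<width>b") zero-pads x to the requested number of binary digits.
    return [format(x, f"0{width}b") for x in range(2 ** width)]

three_bit = all_bit_strings(3)
eleven_bit = all_bit_strings(11)

print(three_bit)
# ['000', '001', '010', '011', '100', '101', '110', '111']

print(len(eleven_bit), eleven_bit[0], eleven_bit[-1])
# 2048 00000000000 11111111111
```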

Step 7: If...Else

We've got to check which position the missing word is in. 

As we stated earlier, we have to take one of two approaches depending on which position the missing word is in.

So, we're going to utilize something called an if else statement. This allows us to enact different parts of code based on a specific condition.

In our case, the specific condition is whether the last word is missing or not.

Let's start with if the last word is missing.

Step 8: If...

    If the last word is missing, we can do what we did before in the previous entry.

    First we check if the last position of seed_phrase_binary is equal to "?".

    As a reminder entropy actually consists of

    • Each of the 11 bits of the first 23 words (for a total of 253)
    • Plus the first 3 bits of the last word.

    For a total of 256 bits. This is the required number of bits to calculate our checksum (the last 8 bits) for our 24 word seed phrase.

    But since we're missing the last word, we're missing some bits (3 to be exact, to make 256 bits for a 24 word seed phrase). So we're going to save every combination of 3 bits (from 000 to 111) into missing_bits_possible.

    We also need to calculate the checksum. We're going to create a new variable, checksum, but set it empty ("") since we haven't calculated it yet.

    Lastly, we're going to save our seed phrase in binary format (without the last word) in a variable called entropy_less_missing_bits_possible. This would be the first 253 bits.

    #this checks if there is a last word and saves information in variables depending on if there is a last word or not.
    if seed_phrase_binary[-1] == "?": #if the last word is missing
        
        #calculates all the possible permutation of missing bits for entropy and saves it into missing_bits_possible
        missing_bits_possible = [bin(x)[2:].rjust(num_missing_bits, "0") for x in range(2**num_missing_bits)]
    
        #saves nothing into checksum
        checksum = ""
    
        #saves the entropy into entropy_less_missing_bits_possible
        entropy_less_missing_bits_possible =  ["".join(seed_phrase_binary[:-1])]
    

    Step 9: Else

    Else, another word is missing and there is a last word. The last word contains some valuable data we need to save.

    Remember, we don't care necessarily about the word itself, but the 11 bit number that word represents.

    We want the first bits (in this case first 3 bits for a 24 word seed phrase) saved into what would be our missing_bits_possible. There is only one possibility since we do have a last word.

    That way, when we try every single 11 bit combination for the missing word, we can fill in the last 3 bits to get 256 bit entropy for our 24 word seed phrase required to calculate the checksum.

    We also want the rest of the bits (the last 8 bits for a 24 word seed phrase) saved into our checksum. We're going to use this later to check against each calculated checksum to see if they match.

    And finally, we're going to replace the missing word position with every combination of 11 bits (2048 possibilities) and save each of the resulting first 23 word combinations (253 bits) into entropy_less_missing_bits_possible.

    else: #if the last word is not missing
        
        #saves the first bits into missing_bits_possible
        missing_bits_possible = [seed_phrase_binary[-1][0:num_missing_bits]]
        
        #saves the last bits into checksum
        checksum = seed_phrase_binary[-1][-(11-num_missing_bits):]
    
        #calculates all the possible entropies (without last word) with every 11 digit bit number in the missing word location and saves it into entropy_less_missing_bits_possible
        entropy_less_missing_bits_possible = ["".join(word if word != "?" else word_bit for word in seed_phrase_binary[:-1]) for word_bit in possible_word_bits]
    

    If you'd like to test this portion of the code, add this to your code:

    print(missing_bits_possible)
    
    print(checksum)
    
    print(len(entropy_less_missing_bits_possible))
    

    "len( )" gives you the length of the variable. In other words, how many "things" are stored in that variable. You should get this result:

    ['000']
    10001000
    2048

    Your last word is "baby", and its index number is 136. In 11 digit binary that is 00010001000. The first 3 bits, "000", are your missing_bits_possible. The last 8 bits, "10001000", are your checksum.

    Since you're replacing the first word with each of the 2048 words, you should have 2048 possibilities in entropy_less_missing_bits_possible.
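Here is a small sketch of those two ideas on their own. The first part slices the 11 bits for "baby" (index 136, as stated above) exactly the way the else branch does. The second part fills a "?" slot using a toy phrase with 2 bit "words"; toy_phrase and toy_candidates are made-up values, and unlike the real code the toy does not drop the last word first.

```python
num_missing_bits = 3                  # value for a 24 word seed phrase
last_word_bits = format(136, "011b")  # 11 bits for "baby"

entropy_part = last_word_bits[0:num_missing_bits]
checksum_part = last_word_bits[-(11 - num_missing_bits):]
print(entropy_part, checksum_part)  # 000 10001000

# Toy version of the substitution: every candidate goes into the "?" slot.
toy_phrase = ["?", "01", "10"]
toy_candidates = ["00", "11"]
filled = ["".join(word if word != "?" else candidate for word in toy_phrase)
          for candidate in toy_candidates]
print(filled)  # ['000110', '110110']
```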

    Step 9: Completing Our Entropy

    Now, we're going to append each possible set of missing bits (3 bits here) to each of our 253 bit possibilities to complete our 256 bits.

    Since we're not missing the last word, there is only one possible set of 3 missing bits ("000" in this case).

    But remember, if we were missing the last word, there would be 8 possibilities (000, 001...111) for this 24 word seed phrase.

    Add this to your code:

    #this will add the missing bits we saved (if there is last word) or calculated (if there is no last word) earlier.
    entropy_possible = [bit_combination + missing_bits for missing_bits in missing_bits_possible for bit_combination in entropy_less_missing_bits_possible]
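That line uses two "for" parts in one list, which pairs every item from one list with every item from the other. A toy sketch with short stand-in strings (the prefixes and suffixes here are made up for illustration) shows what comes out and in what order:

```python
prefixes = ["A", "B"]   # stand-ins for the 253 bit strings
suffixes = ["0", "1"]   # stand-ins for the missing bit combinations

# Same shape as the tutorial's line: the first "for" is the outer loop.
combined = [prefix + suffix for suffix in suffixes for prefix in prefixes]

print(combined)       # ['A0', 'B0', 'A1', 'B1']
print(len(combined))  # 4, i.e. len(prefixes) * len(suffixes)
```

In our case that is 2048 x 1 = 2048 entropies, or 1 x 8 = 8 if the last word were the missing one.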
    

    Step 10: Calculating and Comparing the Checksums

    Now that we have our 256 bit entropy possibilities, we're going to calculate our checksums by inputting each one into the SHA256 function.

    This code below will calculate each checksum and compare it to the one we have saved earlier.

    If it is a match, we know it is a valid seed phrase and will save it into seed_phrase_binary_possible. If it is not a match, it won't save it.

    #inputs each entropy_possible in the SHA256 function to result in the corresponding checksum. It then will compare it to the checksum we saved earlier.
    #if it is a match, then it will add it on to the end of the entropy and save it to variable seed_phrase_binary_possible.
    #or if there was no saved checksum (i.e. no last word) then it will add all the potential checksums to the end
    import hashlib
    
    seed_phrase_binary_possible = [entropy + calc_checksum for entropy in entropy_possible if checksum == (calc_checksum := format(hashlib.sha256(int(entropy, 2).to_bytes(len(entropy) // 8, byteorder="big")).digest()[0],"08b")[:11-num_missing_bits]) or checksum == ""]
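That line packs several steps together, so here they are unrolled for a single entropy value. This is only a sketch for reading along: the all-zero entropy is an assumed test value we picked, not one of our 2048 candidates.

```python
import hashlib

entropy = "0" * 256   # assumed test value: 256 bits that are all zero

# 1. Turn the bit string into a number, then into 32 raw bytes.
entropy_bytes = int(entropy, 2).to_bytes(len(entropy) // 8, byteorder="big")

# 2. Hash the bytes with SHA256 and take the first byte of the result.
first_byte = hashlib.sha256(entropy_bytes).digest()[0]

# 3. Write that byte as 8 bits. For 24 words, this is the whole checksum.
calc_checksum = format(first_byte, "08b")

# 4. The last word's 11 bits are the final 3 entropy bits plus the checksum.
last_word_bits = entropy[-3:] + calc_checksum

print(calc_checksum, int(last_word_bits, 2))
```

In the real code, step 3 is followed by comparing calc_checksum with the checksum we saved from the last word, and the entropy is kept only if they match.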
    

    If you'd like to test this portion of the code, add this to your code:

    print(len(seed_phrase_binary_possible))
    

    Run it and this will be the result:

    10

    There are 10 possible valid seed phrases.

    Step 11: Determining the Missing Word

    Finally, we are going to convert this binary number back to words to get the potential seed phrases.

    Add this to your code:

    
    #this will convert each binary seed phrase into its indexed numbers, then into the words
    seed_phrase_word_possible = (" ".join([word_list[int(binary[i:i+11],2)] for i in range(0, len(binary), 11)]) for binary in seed_phrase_binary_possible)
    
    print(*seed_phrase_word_possible, sep = "\n\n")
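The conversion back to words works by cutting the long bit string into 11 bit pieces. Here is a toy sketch of that slicing with a three-word bit string and a made-up four-word list (toy_word_list is a stand-in, not the BIP39 list):

```python
toy_word_list = ["w0", "w1", "w2", "w3"]   # hypothetical stand-in list

# Three "words" of 11 bits each: indexes 0, 3 and 1.
binary = "00000000000" + "00000000011" + "00000000001"

# Step through the string 11 characters at a time.
chunks = [binary[i:i + 11] for i in range(0, len(binary), 11)]

# Each chunk is read as a base-2 number, which is the word's index.
indexes = [int(chunk, 2) for chunk in chunks]
words = " ".join(toy_word_list[index] for index in indexes)

print(indexes)  # [0, 3, 1]
print(words)    # w0 w3 w1
```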
    

    This will be the result:

    crawl entire sniff tired miracle solve shadow scatter hello never tank side sight isolate sister uniform advice pen praise soap lizard festival connect baby
    
    element entire sniff tired miracle solve shadow scatter hello never tank side sight isolate sister uniform advice pen praise soap lizard festival connect baby
    
    insane entire sniff tired miracle solve shadow scatter hello never tank side sight isolate sister uniform advice pen praise soap lizard festival connect baby
    
    million entire sniff tired miracle solve shadow scatter hello never tank side sight isolate sister uniform advice pen praise soap lizard festival connect baby
    
    rally entire sniff tired miracle solve shadow scatter hello never tank side sight isolate sister uniform advice pen praise soap lizard festival connect baby
    
    release entire sniff tired miracle solve shadow scatter hello never tank side sight isolate sister uniform advice pen praise soap lizard festival connect baby
    
    roof entire sniff tired miracle solve shadow scatter hello never tank side sight isolate sister uniform advice pen praise soap lizard festival connect baby
    
    symptom entire sniff tired miracle solve shadow scatter hello never tank side sight isolate sister uniform advice pen praise soap lizard festival connect baby
    
    timber entire sniff tired miracle solve shadow scatter hello never tank side sight isolate sister uniform advice pen praise soap lizard festival connect baby
    
    weapon entire sniff tired miracle solve shadow scatter hello never tank side sight isolate sister uniform advice pen praise soap lizard festival connect baby

    The first word is different for each one (since it was the missing word).

    Putting It All Together

    We've got the potential missing words of our seed phrase! This code works with only one missing word in any position (including the last).

    Here is the entirety of the code we discussed above:

    #seed phrase separated by spaces
    seed_phrase = "? entire sniff tired miracle solve shadow scatter hello never tank side sight isolate sister uniform advice pen praise soap lizard festival connect baby"
    
    #converts seed phrase into a list to be able to interface with each word individually.
    seed_phrase = seed_phrase.split(" ")
    
    #opens the "english.txt" file and stores it into variable "english"
    english = open("english.txt")
    
    #reads the "english.txt" file stored in variable "english" and stores the words in the variable "word_list". Also, changes the variable type to a list.
    word_list = english.read().split("\n")
    
    #closes the "english.txt" file stored in variable "english" since we don't need it anymore.
    english.close()
    
    #converts seed_phrase (with words) to indexed number in BIP39 wordlist
    seed_phrase_index = [word_list.index(word) if word != "?" else word for word in seed_phrase]
    
    #converts seed_phrase_index (with numbers) to binary
    seed_phrase_binary = [format(number, "011b") if number != "?" else number for number in seed_phrase_index]
    
    #calculates the number of bits missing for entropy
    num_missing_bits = int(11-(1/3)*(len(seed_phrase)))
    
    #calculates all the possible bits for a missing word
    possible_word_bits = [bin(x)[2:].rjust(11, "0") for x in range(2**11)]
    
    #this checks if there is a last word and saves information in variables depending on if there is a last word or not.
    if seed_phrase_binary[-1] == "?": #if the last word is missing
        
        #calculates all the possible permutation of missing bits for entropy and saves it into missing_bits_possible
        missing_bits_possible = [bin(x)[2:].rjust(num_missing_bits, "0") for x in range(2**num_missing_bits)]
    
        #saves nothing into checksum
        checksum = ""
    
        #saves the entropy into entropy_less_missing_bits_possible
        entropy_less_missing_bits_possible =  ["".join(seed_phrase_binary[:-1])]
    
    else: #if the last word is not missing
    
        #saves the first bits into missing_bits_possible
        missing_bits_possible = [seed_phrase_binary[-1][0:num_missing_bits]]
        
        #saves the last bits into checksum
        checksum = seed_phrase_binary[-1][-(11-num_missing_bits):]
    
        #calculates all the possible entropies (without last word) with every 11 digit bit number in the missing word location and saves it into entropy_less_missing_bits_possible
        entropy_less_missing_bits_possible = ["".join(word if word != "?" else word_bit for word in seed_phrase_binary[:-1]) for word_bit in possible_word_bits]
    
    #this will add the missing bits we saved (if there is last word) or calculated (if there is no last word) earlier.
    entropy_possible = [bit_combination + missing_bits for missing_bits in missing_bits_possible for bit_combination in entropy_less_missing_bits_possible]
    
    #inputs each entropy_possible in the SHA256 function to result in the corresponding checksum. It then will compare it to the checksum we saved earlier.
    #if it is a match, then it will add it on to the end of the entropy and save it to variable seed_phrase_binary_possible.
    #or if there was no saved checksum (i.e. no last word) then it will add all the potential checksums to the end
    import hashlib
    
    seed_phrase_binary_possible = [entropy + calc_checksum for entropy in entropy_possible if checksum == (calc_checksum := format(hashlib.sha256(int(entropy, 2).to_bytes(len(entropy) // 8, byteorder="big")).digest()[0],"08b")[:11-num_missing_bits]) or checksum == ""]
    
    #this will convert each binary seed phrase into its indexed numbers, then into the words
    seed_phrase_word_possible = (" ".join([word_list[int(binary[i:i+11],2)] for i in range(0, len(binary), 11)]) for binary in seed_phrase_binary_possible)
    
    print(*seed_phrase_word_possible, sep = "\n\n")
    

    We can try each one of them to see which one results in a balance. A bit more to try out (10 possibilities) but it's not impossible.

    But what if we had more possibilities, hundreds of them if not more? How do we check all of them? One by one? Yes, but we're going to code something to check each one for us.

    We'll discuss that in the next part of our series: Find a Used Seed Phrase
