pybtc icon indicating copy to clipboard operation
pybtc copied to clipboard

Vulnerability in the Shamir Secret Sharing Scheme Implementation

Open emmanueloshin opened this issue 2 years ago • 4 comments

I found vulnerability in the implementation of the scheme that allows an attacker to directly recover some content of share C with only shares A and B in a 3 of 5 threshold scheme. It works in most cases but not all depending on the entropy. If a mnemonic phrase 'S' is split into 3 of 5 shares where shares 'A', 'B' and 'C' are sufficient to restore the secret 'S', if A and B only is used to restore a mnemonic phrase using the mnemonic tool and an output 'C' is generated, then the output 'C' is then put again to replace 'B' and to generate output 'D' and the process is repeated on and on, the outputs C, D,........... formulates the words that consists of C.

For example, S = stuff execute bounce auto brisk orbit creek ticket miracle bleak desk audit This passphrase was spit using the mnemonic tool and using a 3 of 5 threshold scheme in which A, B and C are sufficient to reproduce the secret 'S' where

A = lend green anchor album custom grape repeat easily inflict ring million plate

B = fuel shove embrace track photo truly cart supply action old fancy rent

C = buffalo eagle copy main orient toe brown clump draft negative split ride

When A and B are used to restore memonic we get A + B => D: D = giggle abuse marine emotion stereo onion demand soft found foam wild dust D doesn't have any words that C has but continuing the process

A + D => E

E = invest duty remove science angry crouch bitter target palm buffalo bulk twelve WE HAVE 1 PHRASE 'BUFFALO'

A + E => F

F = quit fun robust viable toe fold city tragic view ladder powder meat WE HAVE ANOTHER PHRASE 'TOE'

A + F => Non unique or invalid shares

D + B => Non unique or invalid shares

E + B => G

G = fiber later dynamic ride below stadium magnet alien lab high bachelor favorite WE HAVE ANOTHER PHRASE 'RIDE'

F+B => Non unique or invalid shares

G+B => Non unique or invalid shares

D+ E => H

H = grab copy pulp large stomach donate gap canoe gloom chase often confirm WE HAVE ANOTHER PHRASE 'COPY'

D + F => Non unique or invalid shares

D + G => Non unique or invalid shares

E + F => Non unique or invalid shares

E + G  => Non unique or invalid shares

E + H => I

I = auto flavor eagle alley horror culture capital nose ranch beauty sure notice WE HAVE ANOTHER PHRASE 'EAGLE'

E + I => J

J = vanish enlist paper junk off grunt typical october abuse jump absent cart NOTHING HERE

J + I => K

K = breeze brown jump modify radio opinion auction magic indicate favorite disease define WE HAVE ANOTHER PHRASE 'BROWN'

Each of the phrases below has at least one word that make up Share 'C' every other step gave the "Non unique or invalid shares" error. Meaning that the attacker has only very few words to play with and can get the mnemonic in no time.

Following this process the attacker was able to retrieve Six words including the check sum word. BUFFALO, EAGLE, COPY, TOE, BREEZE and the check sum word RIDE.

This vulnerability directly exposes the content of Share 'C' with only two known shares in a 3 of 5 scheme implementation.

My Email: [email protected]

(Edited): Please use my updated BTC address

BTC Address: bc1q3epnauwl4m7ukl2n4aygc2n3g3rj832kwg50n6

emmanueloshin avatar May 12 '23 09:05 emmanueloshin

I tried this method, but you also have to remember, that each time the program is run, it generates a different set of mnemonics for the same indices, and the words obtained may be present in any of the 3rd, 4th or 5th mnemonic shares (which happened in my test case).

The maximum number of mnemonic shares that can be generated in each run is 255, and each run yields different shares each time, so combining the threshold number of shares from multiple runs won't yield the same original mnemonic. For example:

##The original mnemonic for testing; will be split into 3 of 5 thresholds m = 'goose garden nuclear hire minute flavor just six reject broken ability damp'

shares1 = split_mnemonic(mnemonic=m, threshold=3, total=5) print(shares1) ###Output: 182: 'dinner leaf casual toward relief exact turn arrive dutch detect stem ranch', 206: 'life engine coin ketchup exchange rate swing glide lesson control refuse boil', 15: 'plate bronze churn vast hand cruel story west chapter coin change easily', 54: 'random frown clump upset surface reject forum gap egg drastic promote few', 58: 'all pig elder twin meat reveal giant arena across hungry fire express'}

shares2 = split_mnemonic(mnemonic=m, threshold=3, total=5) print(shares2) ##Output: 218: 'salute agree venue purity weird rate match swift side fault dwarf cable', 174: 'drastic tail drop season urban reopen loud buzz cook gallery apart merge', 63: 'stool dutch dilemma final predict tissue option embrace acquire boss dust mixture', 51: 'income sibling problem suspect vital trash govern apology flip liquid miracle finger', 252: 'laptop modify orange badge solid wheel hero foster excuse present lumber ripple'}

Only 3 of shares1 or 3 of shares2 will yield the original mnemonic 'm' when combined even though both of the batches may contain common words (like 'dutch' in 182 index in first run and index 63 in second run). They can't be mixed, as on every function call for 'split_mnemonic', a new list of 255 different mnemonics (max. limit) can be generated.

Do let me know if you find it that way, though.

SmartArt09 avatar Nov 22 '24 14:11 SmartArt09

@4tochka No please send here instead

bc1q3epnauwl4m7ukl2n4aygc2n3g3rj832kwg50n6

🙏 Thanks so much!!!!

On Sun, Apr 27, 2025, 10:22 AM Aleksey Karpov @.***> wrote:

4tochka left a comment (bitaps-com/pybtc#37) https://github.com/bitaps-com/pybtc/issues/37#issuecomment-2833347717

Please confirm that BTC Address: bc1qep7ln5dn6wkmefkw2vmy2r68sw49sde78fvu5x is still actual for your reward

— Reply to this email directly, view it on GitHub https://github.com/bitaps-com/pybtc/issues/37#issuecomment-2833347717, or unsubscribe https://github.com/notifications/unsubscribe-auth/A7YOBNR3LWVQ57OP2Z2HJJT23SOTVAVCNFSM6AAAAABSJQDZZGVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDQMZTGM2DONZRG4 . You are receiving this because you authored the thread.Message ID: @.***>

emmanueloshin avatar Apr 27 '25 11:04 emmanueloshin

Confirmed Vulnerability in Shamir Secret Sharing Implementation (Mnemonic Leak via Iterative Reconstruction)

I’m confirming and clarifying a vulnerability in the Shamir secret sharing logic combined with BIP-39 encoding in this repo after doing much research on my previous work.

Issue Summary

In a 3-of-5 Shamir secret sharing scheme, it is possible to recover partial content of a third share (e.g. C) using only two shares (e.g. A and B) through iterative recombination and mnemonic reconstruction.

Example:

If you perform:

A + B => D
A + D => E
A + E => F
...

and convert each resulting share into mnemonics, several words from the original share C begin to appear in later recombined outputs (e.g., BUFFALO, COPY, TOE, RIDE, etc.). As I've illustrated previously...

This violates the confidentiality guarantee of the (k-of-n) scheme: with fewer than k shares, no information about the secret or other shares should be obtainable.

Root Cause

The vulnerability stems from this line in pybtc/functions/shamir.py:

e = generate_entropy(hex=False)

This entropy is reused across multiple secret bytes:

for b in secret:
    q = [b]
    for i in range(threshold - 1):
        if e_i < len(e):
            a = e[e_i]
            e_i += 1
        else:
            e = generate_entropy(hex=False)
            a = e[0]
            e_i = 1
        q.append(a)

As a result, the coefficients of the secret-sharing polynomial leak patterns across shares, especially when the secret is small (e.g., a 12-word mnemonic). Combined with deterministic BIP-39 encoding, this allows attackers to extract parts of other shares via recombination.

Proposed Fixes

  1. Fix entropy reuse in split_secret():

Replace the entire entropy reuse logic with:

for b in secret:
    q = [b] + [generate_entropy(hex=False, length=1)[0] for _ in range(threshold - 1)]

This guarantees unique, fresh randomness for every share coefficient.

Optionally:

  1. Salt the BIP-39 encoding to break deterministic mapping:
salt = random.getrandbits(b)
i = (i << b) | salt

This makes each mnemonic unique even for the same entropy.

  1. Add a prefix/tag to valid shares to block synthetic recombinations:
# Add prefix during share generation
shares[z] = b'\xA5' + shares[z]
# Validate during reconstruction
if not share.startswith(b'\xA5'):
    raise ValueError("Invalid or synthetic share")

Conclusion

This issue allows attackers to extract partial data from shares they should not have access to — clearly violating the expectations of a threshold secret-sharing scheme.

@4tochka Let me know if you'd like me to submit a patch.

Thanks!

@emmanueloshin

P.S: Above BTC address remains valid

emmanueloshin avatar May 10 '25 22:05 emmanueloshin

the example you have given, is for this specific split, for others it is not working

iguha0 avatar Sep 06 '25 01:09 iguha0