Krux encryption
Note: While this PR is in draft form, I will continue to rebase atop develop and force-push. Please excuse me for doing so; I'm assuming that this is not yet public until it is marked ready for review.
What is this PR for?
To solidify krux encryption:
- in preparation for more general encryption of content like user-defined-strings, xpubs, descriptors, maybe even psbts.
- towards a better-defined API to encourage use by others. Note: some forks (earthdiver, 3rditeration) already use it.
Changes made to:
- [x] Code
- [x] Tests
- [ ] Docs
- [ ] CHANGELOG
What is the purpose of this pull request?
- [ ] Bug fix
- [x] New feature
- [ ] Docs update
- [x] Other
Codecov Report
Attention: Patch coverage is 97.96512% with 7 lines in your changes missing coverage. Please review.
Project coverage is 95.62%. Comparing base (0a9aa35) to head (50969c4).
| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/krux/encryption.py | 95.23% | 3 Missing :warning: |
| src/krux/kef.py | 98.84% | 3 Missing :warning: |
| src/krux/pages/login.py | 88.88% | 1 Missing :warning: |
Additional details and impacted files
| Coverage Diff | develop | #546 | +/- |
|---|---|---|---|
| Coverage | 95.56% | 95.62% | +0.05% |
| Files | 76 | 77 | +1 |
| Lines | 8729 | 8952 | +223 |
| Hits | 8342 | 8560 | +218 |
| Misses | 387 | 392 | +5 |
Following (To include into SS fork)
Just a note that I'll likely end up backing out the changes in the previous commit (b761d26 until rebase), because NUL padding combined with authentication doesn't seem solvable in a way that ensures decryption; it's certainly not "simple". I'm leaving it for now and exploring tests that show an exact failure rate (at least 1 in 256 existing encrypted mnemonics would require special handling), as well as ways to do special handling where false-positive authenticated decryption remains a possibility ~~(but a very rare possibility)~~ as rare as the checksum implies.
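To illustrate where the "at least 1/256" figure comes from, here is a minimal sketch, not the krux implementation. The helpers `nul_pad` and `nul_unpad` are hypothetical names, and the samples are hash-derived stand-ins for 12-word entropy. It only shows that stripping trailing NULs cannot distinguish padding from a payload that genuinely ends in 0x00.

```python
import hashlib

BLOCK = 16  # AES block size in bytes


def nul_pad(data: bytes) -> bytes:
    """Pad with 0x00 up to the next multiple of BLOCK (no-op if already aligned)."""
    return data + b"\x00" * (-len(data) % BLOCK)


def nul_unpad(padded: bytes) -> bytes:
    """Strip every trailing 0x00 byte; padding and payload are indistinguishable."""
    return padded.rstrip(b"\x00")


# Deterministic 16-byte samples, as if they were 12-word mnemonic entropy.
samples = [hashlib.sha256(i.to_bytes(4, "big")).digest()[:16] for i in range(100_000)]

# A sample is "lossy" when the pad/unpad round trip does not return the original.
lossy = [s for s in samples if nul_unpad(nul_pad(s)) != s]

print(len(lossy), "of", len(samples), "samples do not survive the round trip")
```

Every lossy sample is one whose last byte is 0x00, which for uniformly random bytes happens about once in 256 samples.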
> ...and exploring tests that show an exact failure rate
~~Latest commit~~ The latest commits have non-"simple" handling of authenticated decryption for unsafe padding (ECB/CBC versions 0, 1, 3, 4; ~~I'm unable to find a solution for similar with GCM~~), as well as a test that will report "failures" (defined as: failed to decrypt AND we didn't avoid encryption, OR KEF encoding failed).
The report below covers a total of 1.4M+ samples per version: each version gets 7 different types of aligned plaintext plus 7 of unaligned plaintext, with the `samples` variable in the test set to 100K.
1st attempt at this report (commit: "adjust to support legacy v0/v1 to be handled like new versions...")
| KEF Version | Timid | Avoid | Fail | Samples |
|---|---|---|---|---|
| 0 AES-ECB | 203129 | 0 | 0 | 1000000 |
| 1 AES-CBC | 3907 | 0 | 0 | 1000000 |
| 2 AES-GCM | 0 | 3868 | 0 | 1000000 |
| 3 AES-ECB v2 | 203114 | 0 | 0 | 1000000 |
| 4 AES-CBC v2 | 3868 | 0 | 0 | 1000000 |
| 5 AES-GCM +p | 0 | 0 | 0 | 1000000 |
| 6 AES-ECB +p | 200006 | 0 | 0 | 1000000 |
| 7 AES-CBC +p | 0 | 0 | 0 | 1000000 |
| 8 AES-GCM +c | 0 | 0 | 0 | 1000000 |
| 9 AES-ECB +c | 0 | 0 | 0 | 1000000 |
| 10 AES-CBC +c | 0 | 0 | 0 | 1000000 |
2nd attempt at this report (commit: "TIL: mode GCM doesn't ever require padding...")
| KEF Version | Timid | Avoid | Fail | Samples |
|---|---|---|---|---|
| 0 AES-ECB | 224594 | 0 | 0 | 1400000 |
| 1 AES-CBC | 5464 | 0 | 0 | 1400000 |
| 2 AES-GCM | 0 | 0 | 0 | 1400000 |
| 3 AES-ECB v2 | 223093 | 0 | 0 | 1400000 |
| 4 AES-CBC v2 | 3980 | 0 | 0 | 1400000 |
| 5 AES-ECB +p | 219979 | 0 | 0 | 1400000 |
| 6 AES-CBC +p | 0 | 0 | 0 | 1400000 |
| 7 AES-GCM +c | 0 | 0 | 0 | 1400000 |
| 8 AES-ECB +c | 0 | 0 | 0 | 1400000 |
| 9 AES-CBC +c | 0 | 0 | 0 | 1400000 |
Failure Summary:
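| Ver | Ver Name | Timid | Avoid | Fail | KEFerr | Samples |
|---|---|---|---|---|---|---|
| 0 | AES-ECB | 237996 | 0 | 0 | 0 | 1400000 |
| 1 | AES-CBC | 5459 | 0 | 0 | 0 | 1400000 |
| 2 | AES-GCM | 0 | 0 | 0 | 0 | 1400000 |
| 3 | AES-ECB v2 | 236591 | 0 | 0 | 0 | 1400000 |
| 4 | AES-CBC v2 | 4166 | 0 | 0 | 0 | 1400000 |
| 5 | AES-ECB +p | 233408 | 0 | 0 | 0 | 1400000 |
| 6 | AES-CBC +p | 0 | 0 | 0 | 0 | 1400000 |
| 7 | AES-GCM +c | 0 | 0 | 0 | 0 | 1400000 |
| 8 | AES-ECB +c | 0 | 0 | 0 | 0 | 1400000 |
| 9 | AES-CBC +c | 0 | 0 | 0 | 0 | 1400000 |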
Per-Version Failure Details:
| Ver | Function | Count | Description |
|---|---|---|---|
| 0 | encrypt | 232537 | ValueError('Duplicate blocks in ECB mode') |
| 0 | encrypt | 5459 | ValueError('Cannot validate decryption for this plaintext') |
| 1 | encrypt | 5459 | ValueError('Cannot validate decryption for this plaintext') |
| 3 | encrypt | 232425 | ValueError('Duplicate blocks in ECB mode') |
| 3 | encrypt | 4166 | ValueError('Cannot validate decryption for this plaintext') |
| 4 | encrypt | 4166 | ValueError('Cannot validate decryption for this plaintext') |
| 5 | encrypt | 233408 | ValueError('Duplicate blocks in ECB mode') |
Types of plaintext messages, all via deterministic hashes, are:
- 16 byte as-if 12w entropy, + re-encoded utf8
- 32 byte as-if 24w entropy, + re-encoded utf8
- 12w mnemonic, + re-encoded utf8 -- a repeat since already utf8
- 24w mnemonic, + re-encoded utf8 -- a repeat since already utf8
- 32 bytes where 1st and 2nd block are same, + re-encoded utf8
- 64 bytes, + re-encoded utf8
- all of above concatenated, + re-encoded utf8
Columns
- Timid: avoided encryption but were actually capable of authenticated decryption
- Avoid: avoided encryption, fortunately, because also failed authenticated decryption
- Fail: didn't avoid encryption and FAILed authenticated decryption
- KEFerr: FAILed successful round-trip of KEF encoding
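The column definitions above can be read as a small decision function. This is a sketch under assumptions: `classify` and its parameters are hypothetical names, the "OK" label is added for samples that count toward no column, and giving KEFerr precedence over the other outcomes is my own choice rather than something the test is known to do.

```python
def classify(encrypt_refused: bool, decrypt_ok: bool, kef_roundtrip_ok: bool = True) -> str:
    """Map one sample's outcome to a report column (hypothetical helper).

    encrypt_refused:  .encrypt() raised instead of producing ciphertext
    decrypt_ok:       authenticated decryption of the (forced) ciphertext succeeded
    kef_roundtrip_ok: KEF encoding round-tripped successfully
    """
    if not kef_roundtrip_ok:
        return "KEFerr"  # assumption: an encoding failure trumps everything else
    if encrypt_refused:
        # Refusing was unnecessary (Timid) or fortunate (Avoid).
        return "Timid" if decrypt_ok else "Avoid"
    # Encryption went ahead: either everything worked or it is a real failure.
    return "OK" if decrypt_ok else "Fail"


for outcome in [(True, True), (True, False), (False, False), (False, True)]:
    print(outcome, "->", classify(*outcome))
```

Only "Fail" and "KEFerr" are failures in the sense used by the report; "Timid" and "Avoid" both mean encryption was refused up front.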
In the report:
- Timid values for ECB come from .encrypt() raising an error because either 1) the plaintext or auth bytes end in 0x00 under unsafe padding, or 2) the plaintext contains repeated AES blocks, which ECB would reveal because the ciphertext blocks repeat as well. In all of these cases we were still able to .decrypt() (w/ internal retries).
- Timid values for CBC come from .encrypt() raising an error because the plaintext or auth bytes end in 0x00 under unsafe padding, but in all cases we were able to .decrypt() (w/ internal retries).
- ~~Avoid values for GCM are raising-error on encrypt because plaintext ends in 0x00, which is fortunate because "authenticated" decryption failed also.~~ There are no Avoid values for GCM because, TIL, GCM doesn't require padding, so its padding has been defined as None, which means that "AES-GCM +p" is no longer a proposed version.
Note: versions 5-9, marked +p or +c, use safe padding (+c also uses compression, which likely disguises block repeats in ECB).
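The source does not say which scheme the "safe padding" is. As an illustration only, PKCS#7 is one well-known padding that stays unambiguous for any payload, including one ending in 0x00. The helper names below are hypothetical and this is not taken from the krux code.

```python
BLOCK = 16  # AES block size in bytes


def pkcs7_pad(data: bytes) -> bytes:
    """Append n bytes of value n, with 1 <= n <= BLOCK (a full block if already aligned)."""
    n = BLOCK - (len(data) % BLOCK)
    return data + bytes([n]) * n


def pkcs7_unpad(padded: bytes) -> bytes:
    """Remove PKCS#7 padding, raising ValueError if it is malformed."""
    if not padded or len(padded) % BLOCK:
        raise ValueError("Invalid padded length")
    n = padded[-1]
    if not 1 <= n <= BLOCK or padded[-n:] != bytes([n]) * n:
        raise ValueError("Invalid padding")
    return padded[:-n]


# Payloads that trip up NUL padding survive this round trip unchanged.
for payload in (b"", b"abc", b"abc\x00", b"\x00" * 16, b"A" * 31):
    assert pkcs7_unpad(pkcs7_pad(payload)) == payload
```

The cost is that aligned plaintext grows by one full block, which is the trade-off against the shorter but ambiguous NUL padding.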
A new function suggest_versions() decides on the KEF version to use when encrypting. This includes whether or not to use compression of plaintext.
It's currently using a threshold of 160 bytes with some exceptions, but I've tried some different bytestring samples to search for a better threshold. Results are below.
| thresh | Content size | Content |
|---|---|---|
| 192 | 192b | `b'\x1f@\xfc\x92\xda$\x16\x94u\ty\xeel\xf5\x82\xf2\xd5\xd7\xd2\x8e\x183]\xe0Z\xbcT\xd0V\x0e\x0fS\x02\x86\x0ce+\xf0\x8dV\x02R\xaa^t!\x05F\xf3i'` |
| 79 | 287b | `b'b43:6X5:TWF835+69CT:6E9*N3Z5-TKLCKEBV.62XRO+5NSD$S'` |
| 112 | 266b | `b'b58:HNRPiyKyPzubRZTXtxEXigjzQiPfzNx9Ceoa9xAUzw6jgf'` |
| 97 | 260b | `b'b64:H0D8ktokFpR1CXnubPWC8tXX0o4YM13gWrxU0FYOD1MChg'` |
| 126 | 424b | `b'wsh(sortedmulti(2,[d63dc4a7/48h/1h/0h/2h]tpubDEXCv'` |
| 87 | 629b | `b'b43:U3VW:FB$5J208:+ALM6IYU43X6UUPI:WTL-YK19*70H8:.'` |
| 111 | 583b | `b'b58:YigQL3LYRA8K123rmiXLNBE2M7HCyyyD1pUZha3j9q4BPo'` |
| 58 | 572b | `b'b64:d3NoKHNvcnRlZG11bHRpKDIsW2Q2M2RjNGE3LzQ4aC8xaC'` |
| 25 | 13115b | `b'abandon ability able about above absent absorb abs'` |
| 84 | 19340b | `b'b43:3ID.D53$41H9GFXH/YLS-I.ED.WQB7197A9L5EL*TM-N-1'` |
| 110 | 17915b | `b'b58:5VHpEs12tM3J4nVMeJfEyWEKeAVSS713bQTXPU65BwdJgA'` |
| 42 | 17492b | `b'b64:YWJhbmRvbiBhYmlsaXR5IGFibGUgYWJvdXQgYWJvdmUgYW'` |
In the above analysis, a threshold was considered "better" if compressing a sample of that size resulted in fewer bytes than not compressing it. It did not take the processing overhead of compression into account, only whether or not there was a "size" win.
For random bytes (like mnemonic entropy), the compressed form is never smaller. For English words, there is a benefit even for really short strings, i.e. 25 in the case of the bip39 word list. For the others, 80 to 120 seems like a much better "compress" threshold than 160.
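A search like the one described can be sketched as follows. The assumptions are mine: raw deflate via Python's `zlib` stands in for whatever compressor krux uses, and "threshold" is read as the smallest prefix length at which the compressed form is shorter than the raw form. The names `deflate` and `compress_threshold` are hypothetical.

```python
import hashlib
import zlib


def deflate(data: bytes) -> bytes:
    """Raw deflate, no zlib header or checksum (compressor choice is an assumption)."""
    comp = zlib.compressobj(level=9, wbits=-15)
    return comp.compress(data) + comp.flush()


def compress_threshold(sample: bytes):
    """Smallest prefix length n where deflate(sample[:n]) is shorter than n, else None."""
    for n in range(1, len(sample) + 1):
        if len(deflate(sample[:n])) < n:
            return n
    return None


# 192 hash-derived bytes standing in for random entropy, plus a repetitive word sample.
random_like = b"".join(hashlib.sha256(bytes([i])).digest() for i in range(6))
wordy = b"abandon ability able about above absent absorb abstract " * 4

print("random-like threshold:", compress_threshold(random_like))
print("wordy threshold:", compress_threshold(wordy))
```

Like the analysis above, this looks only at size and ignores the time and memory cost of compressing on-device.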
suggest_versions() is called with the "plaintext" bytes and the user's AES "mode" preference, so it is possible to make an intelligent per-plaintext decision. On the other hand, assuming that this is only used for plaintext that the user considers a sensitive secret, I want to remain careful about how much analysis is performed on their secrets prior to encryption (other than what is absolutely necessary, like checking for repeated blocks in mode AES-ECB).
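The repeated-block check mentioned above can be sketched in a few lines. This is not the krux code: `has_duplicate_blocks` and `guard_ecb` are hypothetical names, and the check is assumed to run on plaintext that is already padded to a multiple of 16 bytes. Only the error message is taken from the report above.

```python
def has_duplicate_blocks(data: bytes, block_size: int = 16) -> bool:
    """Return True if any two block_size-byte blocks of data are identical."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    return len(blocks) != len(set(blocks))


def guard_ecb(padded_plaintext: bytes) -> None:
    """Refuse ECB for plaintext whose ciphertext would visibly repeat."""
    if has_duplicate_blocks(padded_plaintext):
        raise ValueError("Duplicate blocks in ECB mode")


guard_ecb(bytes(range(32)))  # two distinct blocks: passes silently
try:
    guard_ecb(b"A" * 32)     # two identical blocks: refused
except ValueError as err:
    print("refused:", err)
```

The check compares blocks only for equality and never interprets their contents, which keeps the inspection of the user's secret to a minimum.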
No responses necessary, just sharing these thoughts.
Controls created in the simulator (Crypto.Cipher.AES MODE_CTR), now that MODE_CTR has arrived in MaixPy.ucryptolib:
These can be read from Tools/Datum Tool in the "kef_ui_prototype" branch.
"im sixteen bytes", version: "AES-CTR", key: "k", encoding: "base43"
"2of3 multisig descriptor", version: "AES-CTR +c", key: "k", encoding: "base43"
12-layer Matryoshka, a text message wrapped in consecutive envelopes for every version supported.
All keys are "abc", encoding: "binary"