cryptography
Streaming API for XOFs
The SHAKE family of extendable-output functions (XOFs) is sometimes used as, e.g., a deterministic random number generator, in the following pattern (with functions named after the sponge construction of Keccak):
# pseudocode
xof = xof.new()
xof.absorb(bytes)
xof.absorb(bytes)
xof.finalize() # absorb should fail now
ten_bytes_of_output = xof.squeeze(10)
another_1000_bytes = xof.squeeze(1000)
(finalize may be implicit in the first squeeze. Note that you usually can't absorb, squeeze, and then absorb again unless you keep a copy of the pre-finalize state.)
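For illustration, the pattern above can be approximated today on top of hashlib. This is only a sketch: the class name `SpongeXof` and its method names are hypothetical (they mirror the pseudocode, not any existing library API), and it assumes that `hashlib.shake_256(...).digest(n)` always returns the first `n` bytes of the same output stream. Because hashlib cannot continue squeezing, every `squeeze` call recomputes the whole prefix, so the cost grows with the total output length.

```python
import hashlib


class SpongeXof:
    """Hypothetical absorb/finalize/squeeze wrapper around hashlib.shake_256."""

    def __init__(self) -> None:
        self._hash = hashlib.shake_256()
        self._finalized = False
        self._squeezed = 0  # number of output bytes already handed out

    def absorb(self, data: bytes) -> None:
        if self._finalized:
            raise RuntimeError("cannot absorb after finalize")
        self._hash.update(data)

    def finalize(self) -> None:
        self._finalized = True

    def squeeze(self, n: int) -> bytes:
        # finalize is implicit in the first squeeze
        self._finalized = True
        # hashlib has no incremental output, so recompute the whole prefix
        # and slice off the part that has not been returned yet
        out = self._hash.digest(self._squeezed + n)[self._squeezed:]
        self._squeezed += n
        return out


xof = SpongeXof()
xof.absorb(b"seed")
xof.absorb(b"more seed")
xof.finalize()  # absorb fails from here on
ten_bytes_of_output = xof.squeeze(10)
another_1000_bytes = xof.squeeze(1000)
```

The two squeezes together equal a single 1010-byte digest of the concatenated input, which is the behaviour a native streaming API would be expected to provide without the recomputation.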
The current shake256 API, as supported by both Python's own hashlib and cryptography, returns the same bytes every time you call .digest(len), so there is no way to continue squeezing where the previous call left off.
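A minimal check of the hashlib side of this (the cryptography API is not shown here); the input `b"seed"` is just an arbitrary example value:

```python
import hashlib

h = hashlib.shake_256(b"seed")

first = h.digest(10)
second = h.digest(10)   # does not advance: same 10 bytes again
longer = h.digest(20)   # a longer request starts from the beginning too

assert first == second
assert longer[:10] == first
```

In other words, each call restarts from the beginning of the output stream, and a shorter digest is a prefix of a longer one.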
References:
- This small class turns hashlib's implementation into a streaming interface: https://github.com/GiacomoPope/dilithium-py/blob/a431369cb639c2e161e2cd9ef69fdd1eef033801/shake_wrapper.py
- Raccoon (post-quantum signature scheme) uses SHAKE256 as a deterministic random number generator: https://github.com/masksign/raccoon/blob/72a5cf077e5f0a898a453ba84d778c550cd0a203/ref-py/racc_core.py#L334-L340
- PyCryptodome does support XOF streaming: https://www.pycryptodome.org/src/hash/hash#extensible-output-functions-xof
- https://github.com/openssl/openssl/pull/7921
N.b. This relates somewhat to #2358, but that one seems more encryption-focused.
Our APIs are limited to what OpenSSL is capable of right now and, as you found in openssl/openssl#7921, OpenSSL can't currently do repeated squeezing. The most recent traffic on that PR suggests they want to change it, but it doesn't look like there's been much traction.
(I am broadly supportive of adding this as soon as OpenSSL allows it or if we can find some other mechanism that isn't ruinous for performance)
> as you found in openssl/openssl#7921, OpenSSL can't currently do repeated squeezing
The replacement PR for that got merged for OpenSSL 3.3 about a week ago.
I just thought I'd share my own wrapper class for anyone else trying to work around this until there's a proper solution. I think mine is marginally more efficient than the dilithium-py wrapper linked above.
class ShakeStream:
    def __init__(self, digestfn) -> None:
        # digestfn is anything we can call repeatedly with different lengths
        self.digest = digestfn
        self.buf = self.digest(32)  # arbitrary starting length
        self.offset = 0

    def read(self, n: int) -> bytes:
        # double the buffer size until we have enough
        while self.offset + n > len(self.buf):
            self.buf = self.digest(len(self.buf) * 2)
        res = self.buf[self.offset:self.offset + n]
        self.offset += n
        return res


if __name__ == "__main__":
    from hashlib import shake_128

    a = ShakeStream(shake_128(b"hello").digest)
    foo = a.read(17) + a.read(5) + a.read(57) + a.read(1432) + a.read(48)
    bar = shake_128(b"hello").digest(17 + 5 + 57 + 1432 + 48)
    assert foo == bar
Just to put the information in this thread: "The next feature release after OpenSSL 3.2 will be OpenSSL 3.3, which will be released no later than 30 April 2024" (https://www.openssl.org/blog/blog/2023/11/23/OpenSSL32/index.html), so we can look a bit more closely at implementing this support soon-ish.