Streaming API for XOFs

Open thomwiggers opened this issue 2 years ago • 4 comments

The SHAKE family of extendable-output functions (XOFs) is sometimes used as, e.g., a deterministic random number generator, in the following pattern (with functions named after the sponge construction of Keccak):

# pseudocode
xof = XOF.new()
xof.absorb(data1)
xof.absorb(data2)
xof.finalize()  # any further absorb should fail now
ten_bytes_of_output = xof.squeeze(10)
another_1000_bytes = xof.squeeze(1000)  # continues the stream; does not repeat the first 10 bytes

(finalize may be implicit in the first squeeze. Note that you usually can't absorb, squeeze, and then absorb again unless you keep a copy of the pre-finalize state.)
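To make the intended semantics concrete, here is a minimal sketch of such an interface layered on top of hashlib. The class name SketchXof and its method names are hypothetical, chosen only to mirror the pseudocode above; this is not an existing API in hashlib or cryptography, and it emulates streaming by recomputing the output prefix on every call.

```python
import hashlib


class SketchXof:
    """Hypothetical absorb/squeeze wrapper around hashlib.shake_256 (illustration only)."""

    def __init__(self) -> None:
        self._state = hashlib.shake_256()
        self._finalized = False
        self._offset = 0  # number of output bytes already handed out

    def absorb(self, data: bytes) -> None:
        # Absorbing is only allowed before finalize/squeeze.
        if self._finalized:
            raise RuntimeError("cannot absorb after finalize/squeeze")
        self._state.update(data)

    def finalize(self) -> None:
        self._finalized = True

    def squeeze(self, n: int) -> bytes:
        # The first squeeze finalizes implicitly.
        self._finalized = True
        # hashlib restarts the output stream on every digest() call,
        # so recompute the whole prefix and slice off the new part.
        out = self._state.digest(self._offset + n)[self._offset:]
        self._offset += n
        return out


xof = SketchXof()
xof.absorb(b"seed")
xof.absorb(b"more seed")
first = xof.squeeze(10)
rest = xof.squeeze(1000)
```

Because every squeeze recomputes all earlier output, the total cost grows quadratically with the number of bytes read; a native streaming API would avoid that.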

The current SHAKE256 API offered by both Python's own hashlib and cryptography returns the same bytes every time you call .digest(len): each call starts again from the beginning of the output stream instead of continuing it.
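A quick check of the hashlib behaviour (the input b"seed" is an arbitrary example value):

```python
import hashlib

h = hashlib.shake_256(b"seed")
first = h.digest(10)
second = h.digest(10)   # same object, same length
longer = h.digest(20)   # same object, longer request

# Repeated calls do not advance the stream.
assert first == second
# A longer request just extends the same prefix.
assert longer[:10] == first
```

So the only way to get "the next 10 bytes" from hashlib is to request 20 bytes and discard the first 10.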

References:

  • This small class turns hashlib's implementation into a streaming interface: https://github.com/GiacomoPope/dilithium-py/blob/a431369cb639c2e161e2cd9ef69fdd1eef033801/shake_wrapper.py
  • Raccoon (post-quantum signature scheme) uses Shake256 as a deterministic random number generator: https://github.com/masksign/raccoon/blob/72a5cf077e5f0a898a453ba84d778c550cd0a203/ref-py/racc_core.py#L334-L340
  • Cryptodome does support XOF streaming https://www.pycryptodome.org/src/hash/hash#extensible-output-functions-xof
  • https://github.com/openssl/openssl/pull/7921

N.b. This relates somewhat to #2358, but that one seems more encryption-focused.

thomwiggers avatar Jul 06 '23 08:07 thomwiggers

Our APIs are limited to what OpenSSL is capable of right now and, as you found in openssl/openssl#7921, OpenSSL can't currently do repeated squeezing. The most recent traffic on that PR suggests they want to change it, but it doesn't look like there's been much traction.

(I am broadly supportive of adding this as soon as OpenSSL allows it or if we can find some other mechanism that isn't ruinous for performance)

reaperhulk avatar Jul 06 '23 10:07 reaperhulk

> as you found in openssl/openssl#7921, OpenSSL can't currently do repeated squeezing

The replacement PR for that got merged for OpenSSL 3.3 about a week ago.

h-vetinari avatar Nov 20 '23 04:11 h-vetinari

I just thought I'd share my own wrapper class for anyone else trying to work around this until there's a proper solution. I think mine is marginally more efficient than the dilithium-py wrapper linked above.

class ShakeStream:
	def __init__(self, digestfn) -> None:
		# digestfn is anything we can call repeatedly with different lengths
		self.digest = digestfn
		self.buf = self.digest(32) # arbitrary starting length
		self.offset = 0
	
	def read(self, n: int) -> bytes:
		# double the buffer size until we have enough
		while self.offset + n > len(self.buf):
			self.buf = self.digest(len(self.buf) * 2)
		res = self.buf[self.offset:self.offset + n]
		self.offset += n
		return res


if __name__ == "__main__":
	from hashlib import shake_128

	a = ShakeStream(shake_128(b"hello").digest)
	foo = a.read(17) + a.read(5) + a.read(57) + a.read(1432) + a.read(48)
	bar = shake_128(b"hello").digest(17 + 5 + 57 + 1432 + 48)
	assert foo == bar

DavidBuchanan314 avatar Dec 24 '23 13:12 DavidBuchanan314

Just to put the information in this thread: "The next feature release after OpenSSL 3.2 will be OpenSSL 3.3, which will be released no later than 30 April 2024" (https://www.openssl.org/blog/blog/2023/11/23/OpenSSL32/index.html), so we can look a bit more closely at implementing this support soon-ish.

reaperhulk avatar Jan 26 '24 19:01 reaperhulk