secp256k1 icon indicating copy to clipboard operation
secp256k1 copied to clipboard

Unnecessary call to secp256k1_sha256_initialize

Open Coding-Enthusiast opened this issue 1 year ago • 5 comments

When computing tagged-hashes for Schnorr sigs the 3 methods (challenge, aux, nonce) first call secp256k1_sha256_initialize that sets the hashstate (ie. s[0] to s[7] and bytes) to their default SHA256 values then they each immediately change all those values to the precomputed "midstate" values. The first call to secp256k1_sha256_initialize seems wasteful.

https://github.com/bitcoin-core/secp256k1/blob/9a8d65f07f171b07bd7a33828dce073d819fbdef/src/modules/schnorrsig/main_impl.h#L16-L28

https://github.com/bitcoin-core/secp256k1/blob/9a8d65f07f171b07bd7a33828dce073d819fbdef/src/hash_impl.h#L31-L41

Cross post: https://github.com/bitcoin/bitcoin/issues/26712

Coding-Enthusiast avatar Dec 16 '22 17:12 Coding-Enthusiast

I think the call is correct at this level of abstraction. The caller in the schnorrsig module is not supposed to know what secp256k1_sha256_initialize() does, e.g., it could initialize further struct members in the future. Moreover, it's not a performance problem because any sane compiler will hopefully drop the call.

However, that means that one could say that the code in the schnorrsig module shouldn't set the struct members directly. This was also discussed in https://github.com/bitcoin-core/secp256k1/pull/731 but I ended up not doing it. I think my reasoning was that a midstate is really just a triple of the state of "internal" arrays, buffer of unwritten bytes and bytes written so far. This matches the struct exactly. So one way to look at this is that the struct is "public" to other modules, and those modules can simply write it to.

If we really want to fix this code smell, here are three ways:

  1. We could still introduce a setter function but the API would be somewhat annoying due to the 8 integers plus the array (which means we need a pointer). https://github.com/bitcoin-core/secp256k1/pull/731/files#diff-c2d5f1f7616875ab71cd41b053cfb428696988ff89642b931a0963d50f34f7e8R557-R558.
  2. Alternatively, we could restrict the API to midstates with no unwritten bytes (i.e., where the byte counter is divisible by 64). This is all the callers currently need and then we could at least drop the pointer argument. This would mean we introduce a function secp256k1_sha256_initialize_midstate(uint64_t bytes, uint32_t s1, ..., uint32_t s2). (Or alternatively make the 8 arguments an array if 8 arguments feel too many. But I think 8 args is still fine and they provide more type-safety than an array.)
  3. We could make the hash module offer specific initialization functions, e.g., secp256k1_sha256_initialize_bip340_nonce. This works but it feels a little bit wrong to have the bip340-specific code in the hash module.

I prefer solution 2. @Coding-Enthusiast, would be by any chance be interested in working on this?

real-or-random avatar Dec 19 '22 09:12 real-or-random

My C knowledge is readonly!

How about letting the hash module handle the tagged hash computation in general. Something like what bitcoin core project does in the interpreter.cpp file using HashWriter which I believe is reusable and also doesn't need hard-coded midstate values.

Coding-Enthusiast avatar Dec 19 '22 10:12 Coding-Enthusiast

and also doesn't need hard-coded midstate values.

We want hard-coded midstates for performance.

real-or-random avatar Dec 19 '22 10:12 real-or-random

I see. My understanding is that HashWriter precomputes the midstates and is stored in memory to be reused so the performance should be the same.

Coding-Enthusiast avatar Dec 19 '22 10:12 Coding-Enthusiast

@Coding-Enthusiast That's not easily doable in C, as it has no runtime-computed constants. We'd need to store it in the context or so, and compute it on initialization. I think that's overkill for what we're doing here.

sipa avatar Dec 19 '22 14:12 sipa