secp256k1 icon indicating copy to clipboard operation
secp256k1 copied to clipboard

Add silentpayments (BIP352) module

Open theStack opened this issue 1 year ago • 29 comments

This PR adds a new Silent Payments (BIP352) module to secp256k1. The following routines are provided ($a_i$ are input private keys, $A_i$ are input public keys, $b$ and $B$ denote recipient privkeys/pubkeys that would be encoded in silent payment addresses, $d$ and $P$ the keypair for the actual transaction taproot x-only output):

Side Function Inputs Outputs
Sender _create_private_tweak_data $a_1...a_n$, $outpoint_L$ $a_{sum} = (a_1 + a_2 + ... + a_n)$
$inputhash = hash_I(outpoint_L || (a_{sum} * G))$
Receiver _create_public_tweak_data $A_1...A_n$, $outpoint_L$ $A_{sum} = (A_1 + A_2 + ... + A_n)$
$inputhash = hash_I(outpoint_L || A_{sum})$
Receiver _create_tweaked_pubkey $A_{sum}, inputhash$ $A_{tweaked} = inputhash * A_{sum}$
Both _create_shared_secret $Pub$, $sec$
(Sender: $B_{scan}, a_{sum}$
Receiver: $A_{sum}, b_{scan}$
Lightclient: $A_{tweaked}, b_{scan}$)
$SS = (inputhash * sec) * Pub$ (ECDH)
Receiver _create_label_tweak $b_{scan}, m$ $labeltweak = hash_L(b_{scan} || m)$
$label = labeltweak * G$
Receiver _create_address_spend_pubkey $B_{spend}, label$ $B_m = B_{spend} + label$
Sender _sender_create_output_pubkey $SS, B_m, k$ $P_{xonly} = B_m + hash_S(SS || k) * G$
Receiver _receiver_scan_output $SS, B_m, k, tx_{output}$ $t_k = hash_S(SS || k)$
$P_{xonly} = B_m + t_k * G$ [not returned]
$directmatch = P_{xonly} == tx_{output}$
if $directmatch == 0$:
$\quad label1 = tx_{output} - P$
$\quad label2 = -tx_{output} - P$
Receiver _create_output_seckey $b_{spend}, t_k, (labeltweak)$ $d = (b_{spend} + labeltweak) + t_k$

where

  • $hash_I$ denotes a SHA256 tagged hash with tag "BIP0352/Inputs"
  • $hash_L$ denotes a SHA256 tagged hash with tag "BIP0352/Label"
  • $hash_S$ denotes a SHA256 tagged hash with tag "BIP0352/SharedSecret"

For ending up with output key material used for sending to / scanning for / spending from, both sides would follow the chain of tweak_data -> shared_secret -> output key. The public tweak data can be useful for faster scanning of output transactions by storing them in an index, see e.g. Bitcoin Core PR https://github.com/bitcoin/bitcoin/pull/28241. Private tweak data is arguably less useful, so in theory one could collapse the tweak data and shared secret creation functions into a single one, but IMHO it's nicer if the API is symmetric.

As discussed in https://github.com/bitcoin-core/secp256k1/issues/1427#issuecomment-1757675172, the approach of passing in two separate key pairs for taproot and non-taproot inputs is followed here. This may seem a bit confusing at first, but has the advantage that the caller doesn't have to deal with enforcing even y-parity for key material manually (e.g. negating private keys of taproot outputs if they would end up in an odd point), which seems error-prone.

The last commit contains the BIP352 test vectors, converted to C code with a Python script. An earlier version of the tests, directly written in Python (by calling in to the secp256k1 shared library using ctypes) can still be found here: https://github.com/theStack/secp256k1/tree/silentpayments-module-demo_old5

theStack avatar Dec 23 '23 17:12 theStack

Thanks for starting this! Will review. Just wanted to comment that I've made a round of edits to the BIP and it is now updated with the new hashing method (along with the tests).

The PR assumes that the outpoint hash is provided by the caller, but this could change in the future if its calculation involves elliptic-curve operations that would be a good fit for being done within the module as well.

The hash is calculated as hash(outpointL || Asum). Since it commits to the sum of input public keys used, seems like it might make sense to move it into the module? Otherwise, we'd have to do something like calculate the sum of the public keys, return that to the caller so that they can calculate an input hash and then send it back to us.

josibake avatar Jan 10 '24 22:01 josibake

Thanks for starting this! Will review. Just wanted to comment that I've made a round of edits to the BIP and it is now updated with the new hashing method (along with the tests).

👍

The PR assumes that the outpoint hash is provided by the caller, but this could change in the future if its calculation involves elliptic-curve operations that would be a good fit for being done within the module as well.

The hash is calculated as hash(outpointL || Asum). Since it commits to the sum of input public keys used, seems like it might make sense to move it into the module?

Yes, I agree. A naive solution based on the current PR state would be to provide another function for calculating the input hash and leave the others as they are, but then we would need to pass the input pubkeys and calculate the pubkey sum twice (once for the input hash, and once for the pubkey tweak data), which should be avoided. I guess we want to introduce a function for calculating $A_{sum}$ to reuse that result. Will try to figure out an interface. Suggestions would be greatly appreciated :)

theStack avatar Jan 12 '24 11:01 theStack

but then we would need to pass the input pubkeys and calculate the pubkey sum twice

(I need to actually review your implementation, but..) I don't think we need to do it twice? On the sender side, you pass in the private keys, add those up (call it priv_key_sum) and then you can generate the pubkey sum from the private keys (e.g. priv_key_sum.GetPubKey(). On the receiving side, you pass in the pubkeys, sum them, use them in the hash and then use them in the shared secret derivation.

josibake avatar Jan 16 '24 17:01 josibake

but then we would need to pass the input pubkeys and calculate the pubkey sum twice

(I need to actually review your implementation, but..) I don't think we need to do it twice? On the sender side, you pass in the private keys, add those up (call it priv_key_sum) and then you can generate the pubkey sum from the private keys (e.g. priv_key_sum.GetPubKey(). On the receiving side, you pass in the pubkeys, sum them, use them in the hash and then use them in the shared secret derivation.

Oh indeed, I missed that after calculating the sum of private keys, summing up the pubkeys is not necessary anymore.

So if I'm not overlooking something, the interface change should be as simple as replacing the "outpointhash" parameter in the tweak functions with an "outpoint_L" parameter. Will tackle that in a bit.

theStack avatar Jan 17 '24 18:01 theStack

I've force-pushed with a version that is now up to date with the latest state of the BIP, and updated the PR description accordingly. Labels support is also included, and all the test vectors from the BIP pass. The tests are still run from a Python script that directly interacts with the shared library via ctypes, I'll hopefully have something ready soon to create the actual tests in tests_impl.h from that automatically. The code was changed in many places to operate internal data types and functions in the routines (e.g. secp256k1_ge instead of secp256k1_pubkey, secp256k1_scalar instead of uchar-pointers to 32-byte chunks), which seems to make more sense and have less overhead.

(The previous version of the PR is still available here: https://github.com/theStack/secp256k1/tree/silentpayments-module-demo_old)

theStack avatar Jan 24 '24 17:01 theStack

@josibake: Thanks for the initial review, very much appreciated!

Another thought I had (need to actually re-review more carefully to confirm) is we might be able to minimize the de-serializations of keys (moving from bytes -> point). From my understanding, this is one of the "expensive" operations as it involves calculating the y value. This would mean either combining things into larger routines, or having the functions return points instead of serialized pubkeys. Curious what you think @theStack

Good point, that's also something I've been asking myself. I think the expensive part of calculating the y-value only applies for loading secp256k1_xonly_pubkey objects though (not for regular secp256k1_pubkeys), and since those are not used for any intermediate results, it (hopefully) shouldn't be too bad. Directly using the group element type is not possible I think, as that datatype (secp265k1_ge) is internal and not exposed by the API. Would be great to get input from experienced long-term contributors in that subject. // EDIT: oh, you probably meant the raw 33-byte results of the ~~shared secret creation and~~ public tweak data. Maybe those should be changed to secp256k1_pubkey instead.

theStack avatar Jan 26 '24 18:01 theStack

Force-pushed with the following changes:

  • parameter naming: s/outpoint_lowest/outpoint_smallest/, s/tweak_data32/private_tweak_data32/, s/tweak_data33/public_tweak_data/ (the data type for the last one was changed, see below)
  • API description: s/for each input/for each silent payment eligible input/
  • changed the result type of the public tweak data routine to secp256k1_pubkey to avoid expensive deserialization after in the shared secret creation routine (in light of comment https://github.com/bitcoin-core/secp256k1/pull/1471#pullrequestreview-1846116671); AFAICT, this was the only deserialization of public keys that happened for intermediate results?
  • removed unnecessary parity handling of x-only-pubkeys, as they always have an even y (https://github.com/bitcoin-core/secp256k1/pull/1471#discussion_r1467904182) and adapted the API description accordingly
  • added explicit checks for invalid sum results in the tweak creation routines (zero for the private keys sum, point of infinity for public keys sum) with an open TODO whether we want to have a special return code in this case. IIUC, for the public keys sum case, this case could occur during scanning if someone crafted a transaction where the input pubkeys cancel each other out (trivially reachable with two pubkeys P and -P). It might make sense to have a return code signalling "ignore this tx, it's not suitable for silent payments" to differentiate it from an actual error that happened.

theStack avatar Jan 27 '24 03:01 theStack

EDIT: oh, you probably meant the raw 33-byte results of the shared secret creation and public tweak data. Maybe those should be changed to secp256k1_pubkey instead.

Yep! Sorry, my original comment wasn't very clear. But to be precise, what I'm referring to is anytime we have to de-serialize a public key encoding into a point (be it compressed or x-only), we need to calculate a sqrt to get the y-value, which is the "expensive" part. So if we only de-serialize once (taking the inputs from the user) and then for the rest of the process use points (i.e. (x, y) pairs) until the final step where we return the generated x-only pubkeys to the user, this should save us some work!

At a quick glance, it looks like you addressed this in your most recent push!

josibake avatar Jan 27 '24 10:01 josibake

It might make sense to have a return code signalling "ignore this tx, it's not suitable for silent payments" to differentiate it from an actual error that happened.

Thinking about this a bit more, the errors that can happen are:

  1. De-serialization errors: I pass some bytes that cannot be de-serialized into a valid public key
  2. Hash returns something greater than the curve order: extremely unlikely to ever happen
  3. Pubkey / private key sum is point at infinity / 0: extremely unlikely or malicious

I think in all cases the expected user behavior is to move on. For the sender, if they run into any of these errors they would need to pick a different set of inputs (i.e. make a new transaction). For the receiver, their only recourse is to skip the transaction.

I'm not sure what the added value to the user is here if we return different error codes.

josibake avatar Jan 27 '24 10:01 josibake

It might make sense to have a return code signalling "ignore this tx, it's not suitable for silent payments" to differentiate it from an actual error that happened.

Thinking about this a bit more, the errors that can happen are:

  1. De-serialization errors: I pass some bytes that cannot be de-serialized into a valid public key
  2. Hash returns something greater than the curve order: extremely unlikely to ever happen
  3. Pubkey / private key sum is point at infinity / 0: extremely unlikely or malicious

Leaving 2. aside (from what I understand, it's fine to just ignore those unlikely hash-not-within-curve-order cases), I think the difference between 1. and 3. is that the first suggests that the user is using the API in a wrong way (i.e. shouldn't ever happen in a correct implementation), while 3. can be triggered externally for the pubkey sum case, see below.

I think in all cases the expected user behavior is to move on. For the sender, if they run into any of these errors they would need to pick a different set of inputs (i.e. make a new transaction). For the receiver, their only recourse is to skip the transaction.

I'm not sure what the added value to the user is here if we return different error codes.

Yeah, I tend to agree. The rationale behind introducing those TODOs was to consider differentiating between the cases "invalid input data is passed", indicating that the user did something wrong (case 1 above) and "the individual input data passed is fine, but we still can't continue" (case 3 above). Since the transactions appearing in the mempool / block chain are not in the control of the user, such a "point of infinity" case could be triggered externally in the course of scanning for silent payments.

In Bitcoin Core, at some places we call secp256k1 functions with an additional assert(ret), as we are confident that the input data passed is fine and the function always succeeds, e.g. several times in CKey::SignCompact: https://github.com/bitcoin/bitcoin/blob/5fbcc8f0560cce36abafb8467339276b7c0d62b6/src/key.cpp#L255

In case of the pubkey tweak data creation routine, that would be a mistake as an external transaction could make a node crash in the course of e.g. the silent payment index creation. I guess we can avoid this though by just properly documenting in the API that the return code 0 also could mean "tx is not eligible for silent payments"? Maybe I'm overthinking here and users would hopefully always check for return values for more complex routines.

theStack avatar Jan 28 '24 18:01 theStack

"invalid input data is passed", indicating that the user did something wrong (case 1 above) and "the individual input data passed is fine, but we still can't continue" (case 3 above)

This is a good point. In theory, a user could recover from case 1, which at that point I'd agree we need two error codes: one to indicate that the user needs to do something different with the current transaction in order to proceed, and another to indicated the can't do anything with the current transaction and it needs to be skipped.

In practice, I don't think case 1 is likely to happen often since the inputs already exist in validated transactions, it's just a matter of correctly extracting them. That being said, I am slightly leaning towards two error codes but I'd also prefer to be consistent with the other modules if we need a tie breaker. I'm curious if there is a general pattern in the library that we can take inspiration from.

josibake avatar Jan 28 '24 19:01 josibake

Changed the tweak data creation interfaces to take lists of pointers to seckeys/pubkeys (instead of expecting the data to come in concatenated form), as suggested in the discussion https://github.com/bitcoin-core/secp256k1/pull/1471#discussion_r1475018645. Took me a bit to figure out how to properly create the array of pointers with ctypes in the secp256k1 glue module for the Python test-suite, but now everything passes again.

theStack avatar Feb 04 '24 21:02 theStack

Force-pushed with interface changes in the tweak data and shared secret creation routines, following the latest suggestion in https://github.com/bitcoin-core/secp256k1/pull/1471#discussion_r1478572607 / https://github.com/bitcoin-core/secp256k1/pull/1471#discussion_r1480186475. The tweak data creation routines now return both the key sums and input hash, without doing any tweaks yet, in order to avoid extra point multiplication for the full node receiver case (see discussion https://github.com/bitcoin-core/secp256k1/pull/1471#discussion_r1468836824). Also, there is now only one shared secret creation routine left, with the parameters named public_component, private_component and input_hash. I tried documenting in the API description on what has to be passed depending on the scenario; unsure if this is sufficient yet, maybe there is a better, more clear way of presenting the expected parameters to the user (e.g. summarizing it in a table).

One open question is if we really need a dedicated routine to calculate A_tweaked = input_hash * A_sum (for light clients / silent payment indexes...). That routine would essentially be a copy of secp256k1_ec_pubkey_tweak_mul. Is it okay for the user to demand calling this routine, if it's properly documented, or do we still want do have a dedicated routine secp256k1_silentpayments_create_tweaked_pubkey that does the same, making it clear in the API what has to be passed?

Thanks for your review comments regarding labels and your POC branch @josibake, I will look at this closer in a bit.

theStack avatar Feb 10 '24 22:02 theStack

One open question is if we really need a dedicated routine to calculate A_tweaked = input_hash * A_sum (for light clients / silent payment indexes...). That routine would essentially be a copy of secp256k1_ec_pubkey_tweak_mul. Is it okay for the user to demand calling this routine, if it's properly documented, or do we still want do have a dedicated routine secp256k1_silentpayments_create_tweaked_pubkey that does the same, making it clear in the API what has to be passed?

I have a slight preference for wrapping secp256k1_ec_pubkey_tweak_mul in a dedicated secp256k1_silentpayments_created_tweaked_pubkeys routine. This way, you can do everything you need to do with just the silent payments module and don't really need to worry about the rest of libsecp256k1. In general, I think the goal for libsecp256k1 is to further restrict the API and stop exposing routines for low level operations, so always wrapping in our own silent payments specific routines seems better.

josibake avatar Feb 13 '24 11:02 josibake

I've rewritten #28122 to use this module here: https://github.com/josibake/bitcoin/tree/implement-bip352-secp256k1-module. So far, so good! I'll take a fresh look here tomorrow with some feedback, and then update #28122 once I've cleaned up the branch a bit and we are ready to move this PR out of draft. @theStack can you remind me what your criteria for considering this PR "not a draft" were?

josibake avatar Feb 14 '24 18:02 josibake

I've rewritten #28122 to use this module here: https://github.com/josibake/bitcoin/tree/implement-bip352-secp256k1-module. So far, so good! I'll take a fresh look here tomorrow with some feedback, and then update #28122 once I've cleaned up the branch a bit and we are ready to move this PR out of draft. @theStack can you remind me what your criteria for considering this PR "not a draft" were?

Nice, only skimmed over the branch so far but it's good to see the module in full action. As for the out-of-draft criteria, I think we are pretty close, in my view the only two points missing are:

  • add another routine for creating $A_{tweaked}$, as needed for light clients / indexes (trivial, as it's basically a wrapper around secp256k1_ec_pubkey_tweak_mul), https://github.com/bitcoin-core/secp256k1/pull/1471#issuecomment-1941333889
  • having more test coverage in plain C, I think every routine should at least be called once (ideally all test vectors are converted to C code by the Python script, like e.g. done for MuSig2: https://github.com/jonasnick/secp256k1/blob/musig2-module/tools/test_vectors_musig2_generate.py)

theStack avatar Feb 14 '24 23:02 theStack

@josibake: I've add the missing _create_tweaked_pubkey function to the API. As far as I'm aware, a user would have all the necessary routines now that are needed for a full silent payments implementation, without the need to do any manual ECC calculations (see also the summary table with formulas in the PR description). If there are no further objections, I'll move the PR out of draft within the next days, after increasing the test coverage a bit more.

theStack avatar Feb 16 '24 13:02 theStack

@josibake: I've add the missing _create_tweaked_pubkey function to the API. As far as I'm aware, a user would have all the necessary routines now that are needed for a full silent payments implementation, without the need to do any manual ECC calculations (see also the summary table with formulas in the PR description). If there are no further objections, I'll move the PR out of draft within the next days, after increasing the test coverage a bit more.

Looks great! I've updated #28122 to use this branch. I think we can take this out of draft whenever you're ready. While I'm sure the API will change with further review, this seems like a pretty solid starting point!

josibake avatar Feb 19 '24 16:02 josibake

What is the input k in _sender_create_output_pubkey?

I find the label stuff fairly confusing, both in the BIP and in the description here. But no concrete suggestion on how to improve that...

Sjors avatar Feb 20 '24 09:02 Sjors

What is the input k in _sender_create_output_pubkey?

k is a counter used by the receiver to create multiple outputs for the sender. Using the variable k here to match the notation in the BIP (see BIP352#creating-outputs)

josibake avatar Feb 20 '24 09:02 josibake

Force-pushed with changes regarding tests and CI. The demo-only python test runner script has now been replaced by one that converts the BIP352 test vector data in JSON format to C code (./src/modules/silentpayments/vectors.h), that is in turn included and evaluated in the main test file (./src/modules/silentpayments/tests_impl.h). The last test vector is skipped in the generation script, as is it doesn't contain any silent payment eligible inputs; since the tweak data creation APIs require that at least one input is passed, a user would detect that and not call into secp256k1-silentpayments in such a case in the first place.

Looks great! I've updated #28122 to use this branch. I think we can take this out of draft whenever you're ready. While I'm sure the API will change with further review, this seems like a pretty solid starting point!

Agree. I think there is still a long list of things that could/should be added (some manual tests e.g. for edge-cases, constant-time tests, benchmarks, examples, a proper doc/silentpayments.md document... anything else?), so the PR still feels early, but hopefully good enough as starting point. CI just showed green (enabled for silentpayments in the latest commit), so will move out of draft now.

Happy to also take suggestions regarding ease-of-review. Should I squash the implementation commits? Personally I tend to prefer a higher number of small review chunks rather than the other way round, but I also understand that too many commits can trigger a "the PR is too big to even look into" sensation for potential reviewers.

theStack avatar Feb 23 '24 00:02 theStack

I ended up abusing this function [secp256k1_silentpayments_create_output_seckey]

Yeah, this is basically scalar_add, with some high-level docs.

This is something we need to think about. A goal of the library is to provide safe high-level APIs. I guess we often assume that this is equivalent to avoiding access to low-level operations in the API. But as this functions shows, it's not equivalent. This function is part of a high-level API, but it happens to expose a simple low-level operation... Now, there's nothing wrong with this per se, but people will abuse this for all kinds of things.

And let's be honest. If you need a scalar_add, and the library gives you a (disguised) scalar_add, it's just pragmatic to use it. The API docs won't prevent anyone from doing so. Is it their responsibility? Probably yes.


I think the entire module, as proposed, is a bit unusual for libsecp256k1 because it only wraps some ECC operations of a higher-level protocol (as explained in the header). This design makes sense, but it's different from what we usually have in libsecp256k1.

You could say that BIP340 signatures are also just some ECC parts of a higher-level protocol "Bitcoin", but the difference is that they implement a common interface of a signature scheme, what people call a cryptographic primitive (like public-key encryption, key exchange, hashing, etc.). But this here is merely the "cryptographic core of silent payments", and it's tightly coupled to it.

Again, there's per se nothing wrong with a more low-level module, but I think we'll need to spend some thoughts on how we make it fit the library best. Sometimes the potential for abuse can be reduced by having opaque data types that you can only pass from API function to API function. We do something similar in the proposed musig2 API for example. I think this would be closest to the current spirit of the library.

Another design pattern is to have "stage" functions that perform all the different computation that should be done at a specific stage of the protocol. This doesn't seem elegant, but it prevents mistakes where the caller combines the API functions wrongly, e.g., calls them in the wrong order.


Off-topic ramblings about the library:

So if we expose the silentpayments_create_output_seckey function, we might as well go ahead and expose scalar_add and possibly some more scalar and group functions as part of a mid-level, or "hazmat API" as it's called in some libraries (ignoring for a moment that this is work and raises more questions). We anyway have ec_pubkey_combine left as a relic. Then the caller at least won't have to abuse the silent payment API.

And if we had hazmat API, then one could alternatively simply build some libsilentpayments around it, and other applications could do the same. This would certainly help some applications. But it's not obvious that this is the ideal solution. There are multiple related but different questions, e.g., in should stuff be a module, in which repo should it live, who should maintain it, how should it be linked, etc... (And lol, please don't be discouraged by my comments here. I see value in this as a libsecp256k1 module, it's just that the PR makes me think about the bigger picture again...)

real-or-random avatar Feb 28 '24 00:02 real-or-random

Thanks for chiming in @real-or-random , this is exactly the kind of review we are looking for at this stage! As much as possible, we'd like to keep this module in the spirit of the library. I'll take a look at the Musig2 API for inspiration and also spend some more time thinking about the "stages" concept. That feels somewhat closer to what we have now.

And if we had hazmat API, then one could alternatively simply build some libsilentpayments around it

This also makes a lot of sense to me as an approach. For example, I've been chatting with the developers of rust-silentpayments about how their library might evolve if there is a libsecp256k1 silentpayments module. Seems like having a rust silent payments library that consumes a "hazmat" secp256k1 API could be a reasonable solution?

And lol, please don't be discouraged by my comments here

The only thing that discourages me is no comments at all :smile:

josibake avatar Feb 28 '24 18:02 josibake

Thanks for the valuable feedback @real-or-random, glad that the interface discussion is in full swing! After thinking about it for a while, I sympathize with the "libsilentpayments" idea more and more.

So if we expose the silentpayments_create_output_seckey function, we might as well go ahead and expose scalar_add and possibly some more scalar and group functions as part of a mid-level, or "hazmat API" as it's called in some libraries (ignoring for a moment that this is work and raises more questions). We anyway have ec_pubkey_combine left as a relic. Then the caller at least won't have to abuse the silent payment API.

Doesn't secp256k1 already have a good amount of exposed routines that potentially fall into the "hazmat API" category, especially w.r.t. the tweaking routines? For example, even without a silent payments module, one could say that the disguised scalar_add you mentioned is right now available as secp256k1_ec_seckey_tweak_add (that's btw the exact routine called within secp256k1_silentpayments_create_output_seckey, so it's currently more like an alias/wrapper around an already exposed function and hence wouldn't enable something new to the user that's not possible yet). Or would the idea be to go one step further and then also expose the internal scalar/ge(j) data types to such a hazmat API, in order to avoid unnecessary conversions between public and internal data types (and avoid error checks) that potentially hurt performance?

To my understanding, secp256k1 as of now provides all the necessary routines for a full BIP352 implementation (Bitcoin Core PR https://github.com/bitcoin/bitcoin/pull/28122 followed that approach before it used this module). The two main reasons for a secp256k1 module (or library using secp256k1, if the libsilentpayments approach is followed) would be AFAICT 1) efficiency and 2) hiding cryptographic complexity from the user.

It's kind of embarrassing to only come up with this doubts weeks after starting this implementation, but I only realize now that the potential of point 1) to warrant a module was actually never verified. Assuming we would have a libsilentpayments C library that uses secp256k1 as it is now (i.e. even without any extra hazmat API). Would that even be significantly slower than using a dedicated optimized silent payments module using low-level scalar/ge(j) primitives? Without having numbers, I'd assume that the most expensive operations are by far always the involved point multiplications (e.g. the shared secret creation via ECDH), and for those having a hazmat API wouldn't change much.

@josibake: Do you think it makes sense to benchmark the Bitcoin Core PR #28122 before and after it uses this module to compare what the gains are? Happy to investigate this a bit further, if the old version (based on w0xlt's work, AFAIR) is still available somewhere. Trying Sjor's branch for building the index would probably also be interesting to run once with the "pure secp256k1" and once with the "secp256k1-silentpayments" module approach. Even if my simplified performance assumptions above are completely wrong (I wouldn't be surprised if they are :D), I think it would be good to have some benchmark numbers anyway.

theStack avatar Feb 28 '24 23:02 theStack

@real-or-random spent some time digesting your comment and wanted to summarize some thoughts:

I think the entire module, as proposed, is a bit unusual for libsecp256k1 because it only wraps some ECC operations of a higher-level protocol (as explained in the header). This design makes sense, but it's different from what we usually have in libsecp256k1.

In hindsight, I think BIP352 could (should) have been three BIPS:

  • BIP-silent-payments-non-interactive-multikey-ecdh (SPNIMKECDH)
  • BIP-silent-payments-v0 (the part that defines which inputs to use)
  • BIP-silent-payments-address-encoding

This way, if we need change how the inputs are chosen in the future, we'd write a small BIP-silent-payments-v1, and nothing about the SPNIMKECDH part would need to change. Conceptually, when talking about a libsecp256k1 module I'm only talking about SPNIMKECDH (i.e. we wouldn't expect a libsecp256k1module to filter transaction inputs to the silent-payments-v0 eligible inputs before performing the SPNIMKECDH protocol). If we start from this assumption and forget about the labels part of silent payments for a second, we could build a high-level API like so:

Sender

typedef struct {
    secp256k1_pubkey recipient_scan_pubkey,
    secp256k1_pubkey recipient_spend_pubkey,
    size_t n_desired_outputs
} secp256k1_silentpayments_recipient;

/**
 * Takes plain and taproot seckeys as inputs, and a list of the desired recipients
 * and returns a list of generated xonly outputs.
 * In this example, a recipient is a scan/spend pubkey pair along with an int to indicate
 * how many outputs should be created for this particular recipient
 * /
int secp256k1_silentpayments_sender_create_outputs(
    const secp256k1_context *ctx,
    secp256k1_xonly_pubkeys **outputs,
    const unsigned char * const *plain_seckeys,
    size_t n_plain_seckeys,
    const unsigned char * const *taproot_seckeys,
    size_t n_taproot_seckeys,
    const secp256k1_silentpayments_recipient * const *recipients,
    size_t n_recipients,
    const unsigned char *outpoint_smallest36
)

Receiver

typedef struct {
    secp256k1_xonly_pubkey pubkey,
    unsigned char output_tweak[32],
} secp256k1_silentpayments_output;

/**
 * Takes tx input pubkeys, the receivers spend pubkey and scan private key, and tx outputs as inputs
 * and returns a list of found outputs, where the found output consists of the xonly pubkey and the tweak data
 * needed to spend it
 * /
int secp256k1_silentpayments_receiver_scan_outputs(
    const secp256k1_context *ctx,
    secp256k1_silentpayments_output **found_outputs,
    const secp256k1_pubkey **input_pubkeys,
    size_t n_input_pubkeys,
    const secp256k1_pubkey *receiver_spend_pubkey,
    const unsigned char *receiver_scan_seckey,
    const secp256k1_xonly_pubkey * const *tx_outputs,
    size_t n_tx_outputs
)
/** 
 * Takes tx input pubkeys and the smallest outpoint and returns a pubkey
 * that can be stored and distributed to clients who need to scan but dont have
 * access to block data/tx data (index for wallet rescanning, light clients, etc).
 * /
int secp256k1_silentpayments_receiver_create_index_data(
    const secp256k1_context *ctx,
    secp256k1_pubkey *A_sum_tweaked,
    const secp256k1_pubkey **input_pubkeys,
    size_t n_input_pubkeys,
    const unsigned char *outpoint_smallest36
)
/**
 * takes the index data (A_sum*input_hash), tx outputs, receivers spend pubkey and scan priv key
 * and returns found outputs. Intended for light clients/wallet rescanning.
 * /
int secp256k1_silentpayments_scan_from_index_data(
    const secp256k1_context *ctx,
    secp256k1_silentpayments_outputs **found_outputs,
    const secp256k1_pubkey *receiver_spend_pubkey,
    const unsigned char *receiver_scan_seckey,
    const secp256k1_pubkey *A_sum_tweaked,
    const secp256k1_xonly_pubkey *tx_outputs,
)
/** 
 * produces a signature for a silent payments taproot output
 * this function is just a wrapper function around schnorrsign which first
 * tweaks the receivers spend key with the tweak data found during scanning.
 * this way, we avoid needing to expose any `scalar_add` functions to the caller.
 * /
int secp256k1_silentpayments_sign32(
    const secp256k1_context *ctx,
    unsigned char *sig64,
    const unsigned char *msg32,
    const secp256k1_keypair *keypair,
    const unsigned char *aux_rand32
    const unsigned char *tweak32
)

This is more of an example than a proposal to demonstrate a few simple function calls and not exposing things that could be abused or used unsafely. The main question here is "would something like this fit as a module for libsecp256k1," considering its not really a generic cryptographic primitive but also not quite the a full silent payments v0 implementation in that the sender and receiver still need to filter the inputs. If the answer is yes, I think we can start from here and figure out how to add labels in sane way (I have a few ideas for labels, both with opaque data types or stages). If the answer is no, then I think looking into other approaches, like a hazmat API is better.

I also spent some time thinking about whether or not this could be a more generic "non-interactive-multikey-ecdh" module that is used by a libsilentpayments or something, but this didn't seem super promising. Mainly because it wasn't clear that the hashing with a counter k would be generically useful, and without that we'd need to return an unhashed ECDH result. Also similar questions about whether the spend and scan key are generically useful or silent payments specific. Furthermore, we'd still need to do low-level operations even after the ECDH step (adding the spend pubkey, scanning with labels, etc), so it felt like just kicking the can down the road a bit.


@theStack

To my understanding, secp256k1 as of now provides all the necessary routines for a full BIP352 implementation (Bitcoin Core PR https://github.com/bitcoin/bitcoin/pull/28122 followed that approach before it used this module). The two main reasons for a secp256k1 module (or library using secp256k1, if the libsilentpayments approach is followed) would be AFAICT 1) efficiency and 2) hiding cryptographic complexity from the user.

I think this was more an example of "could" rather than "should." From my understanding, bitcoin core is also a bit of a "privileged" user of secp256k1 and the goal of a module would be something generally useful for wallets outside of bitcoin core. Regarding efficiency, I'd be very surprised if this approach is less efficient than what we were originally doing in core, although IIRC bitcoin core is faster when it comes to hashing? The main advantages of doing it all here w.r.t efficiency is not need to serialize/deserialize all the time from compressed keys to group elements. Regarding benchmarking, I do still have the old version not using this module, if you want to compare with that! Would be nice to see some numbers!

josibake avatar Mar 07 '24 19:03 josibake

@theStack I started refactoring the API based on https://github.com/bitcoin-core/secp256k1/pull/1471#issuecomment-1984303397. I've only completed the sender side, but will continue working on the receiver side.

The main changes are:

  1. Provide a single function for the sender: this means the sender doesn't need to worry about grouping addresses and ensuring they pick the right values for k
  2. Remove the tweaked_pubkey routine (for light clients): instead, if the caller does not pass a pointer for input_hash, then include input_hash with the returned A_sum

The branch is still a bit rough so some of the comments / commit messages are likely correct. Mostly just looking for conceptual feedback and curious if you have any objections to this approach. The branch is https://github.com/josibake/secp256k1/tree/bip352-api-refactor . I've tried to keep the edits in the relevant commits so that if the changes make sense we can easily incorporate them into your branch.

josibake avatar Mar 25 '24 18:03 josibake

@theStack I started refactoring the API based on #1471 (comment). I've only completed the sender side, but will continue working on the receiver side.

The branch is still a bit rough so some of the comments / commit messages are likely correct. Mostly just looking for conceptual feedback and curious if you have any objections to this approach. The branch is https://github.com/josibake/secp256k1/tree/bip352-api-refactor . I've tried to keep the edits in the relevant commits so that if the changes make sense we can easily incorporate them into your branch.

Thanks for pushing. A single-function API for sending that takes care of everything (even the scan-pubkey grouping, sort ftw!) seems a good idea to me. What's currently a problem, I think, is that the user doesn't have a way to determine which of the created outputs belong to which recipient. E.g. if the recipients are passed in in the order $(B1_{scan}, B1_{spend})$ and $(B2_{scan}, B2_{spend})$, the outputs could be created in the (unexpected) order $output_{B2}, output_{B1}$, if $B2_{scan}$ is lexographically smaller than $B1_{scan}$.

One solution for this might be to extend the recipient structure by the resulting x-only output key and let the function fill that out, i.e. passing the recipients as in/out-parameter?

theStack avatar Mar 26 '24 00:03 theStack

A single-function API for sending that takes care of everything (even the scan-pubkey grouping, sort ftw!) seems a good idea to me.

yeah, the more I thought about it, we are already requiring the user to pass all of the keys at once so it didn't seem like we are gaining anything with separate function calls. In the future, it might make sense to have functions to enable sending where the sender does not have access to all of the private keys at once, but that will require a lot more thought and seems out of scope for now.

the user doesn't have a way to determine which of the created outputs belong to which recipient

Good catch, I overlooked this! I took your suggestion of extending the recipient struct and updated the branch. This has the added benefit of simplifying the function signature (i.e. no longer need to pass arguments for outputs and n_outputs).


Thinking about the receiver API a bit more, I realized we can't assume the receiver has access to the tx_outputs when scanning. For example, if the client is using BIP158 block filters, they would generate the scriptPubKey and then check if it is in the filter before downloading the block, or in the case of an electrum style client, they might generate the taproot output and then call the electrum backend to see if the scriptPubKey exists in the UTXO set.

Given this, what do you think about using a callback function for checking if the generated output exists / checking if the label exists? It seems libsecp256k1 does use this pattern already in the case of the ECDH module where the caller can provide their own hash function and using callbacks could allow for a much simpler, easy to use API but I'm not what the cost in performance would be or if this would work with language bindings (e.g. a javascript library using libsecp256k1). Curious to hear your thoughts.

josibake avatar Mar 27 '24 17:03 josibake

the user doesn't have a way to determine which of the created outputs belong to which recipient

Good catch, I overlooked this! I took your suggestion of extending the recipient struct and updated the branch. This has the added benefit of simplifying the function signature (i.e. no longer need to pass arguments for outputs and n_outputs).

Ah indeed, smaller function signature is a nice benefit, especially since the number of parameters was quite high already. :+1:

As for the current version: I didn't think about this when I made the suggestion, but now having the generated outputs as pointers within the _silentpayments_recipient struct is a bit of tedious/weird interface for users, as they would have to allocate memory also for those (first, for the array of pointers, and then also for the _xonly_pubkey objects as well!), in addition to the struct itself. Do you think it's an acceptable choice if we just drop n_outputs from the struct and store the output directly as instance, i.e.

typedef struct {
    secp256k1_pubkey scan_pubkey;
    secp256k1_pubkey spend_pubkey;
    secp256k1_xonly_pubkey generated_output;
} secp256k1_silentpayments_recipient;

This should make things much easier imho, as the user only has to allocate memory for a single struct instance per recipient. It should also make the implementation slightly simpler (one loop less). I'm assuming here that creating multiple outputs to the same SP address is rather the exception than the rule. Even if it is used more often, I think repeatedly filling out the same scan/spend keypairs is not a problem for wallets.

Thinking about the receiver API a bit more, I realized we can't assume the receiver has access to the tx_outputs when scanning. For example, if the client is using BIP158 block filters, they would generate the scriptPubKey and then check if it is in the filter before downloading the block, or in the case of an electrum style client, they might generate the taproot output and then call the electrum backend to see if the scriptPubKey exists in the UTXO set.

Oh, wasn't aware of that. The two examples you state here are only applicable for the non-label case though, right (not even the change address label)? To my understanding, when scanning for labels, we always need all outputs of a tx. Is it thinkable to have two scanning interfaces, one for the simple non-label case where simply the output is returned, and one for labels, where all outputs are passed and the labels scanning is done?

Given this, what do you think about using a callback function for checking if the generated output exists / checking if the label exists? It seems libsecp256k1 does use this pattern already in the case of the ECDH module where the caller can provide their own hash function and using callbacks could allow for a much simpler, easy to use API but I'm not what the cost in performance would be or if this would work with language bindings (e.g. a javascript library using libsecp256k1). Curious to hear your thoughts.

Interesting idea, I'd intuitively assume it's neither a huge for problem for performance nor for other language bindings, but would have to check deeper.

theStack avatar Mar 28 '24 02:03 theStack

Do you think it's an acceptable choice if we just drop n_outputs from the struct and store the output directly as instance

This is much simpler. Also agree that cases where a sender is creating multiple outputs for the same recipient will likely be rare, and even in those instances it might still make sense to pass the same address multiple times (e.g. to easily match up the generated address back to the requested amount). This ended up being a bit tedious due to the fact that the generated outputs depend on the order of the recipients, but rather than hack around it here, I updated the test vectors to include all possible output sets.

Oh, wasn't aware of that. The two examples you state here are only applicable for the non-label case though, right (not even the change address label)? To my understanding, when scanning for labels, we always need all outputs of a tx.

Correct, you do need the tx outputs to scan for labels using the (more efficient) subtraction technique. Worth mentioning you can scan for labels without the tx outputs (e.g. the change label) by just adding the label to each output, but this is really inefficient as the number of labels grows.

one for the simple non-label case where simply the output is returned, and one for labels, where all outputs are passed and the labels scanning is done?

I think this makes sense. I'd imagine light clients will not support labels, whereas full nodes will, so it seems reasonable to provide two interfaces.

Still mulling over the callbacks idea, will try to have a concrete proposal over the next few days. I've updated the branch in the meanwhile with the sending changes.

josibake avatar Apr 02 '24 13:04 josibake