semaphore icon indicating copy to clipboard operation
semaphore copied to clipboard

Remove sub() from gas() in SemaphoreVerifier.sol

Open jimmychu0807 opened this issue 1 year ago • 5 comments

There are lines in SemaphoreVerifier.sol that make gas() calls. GAS opcode is restricted in validateUserOp (along with some other OPCODES). GAS is only allowed if immediately followed by CALL (and similar) opcodes. Semaphore does staticcall(sub(gas(), ...) and there is a SUB in between.

So we need some other way to take the gas cost in consideration.

  • related: #345
  • refer also to this blog post (other challenges section): https://saleel.xyz/blog/zk-account-abstraction/

jimmychu0807 avatar Oct 15 '24 10:10 jimmychu0807

@cedoor this is the issue related to #345.

jimmychu0807 avatar Oct 15 '24 10:10 jimmychu0807

@jimmychu0807 thanks 👍🏽

cedoor avatar Oct 15 '24 10:10 cedoor

@saleel currently what is the simplest way to setup a test environment to trigger the issue that validateUserOp() calls SemaphoreVerifier.sol and fails due to having a sub call before gas call (without moving heaven and earth 🙂)?

If this can't be done, I can't reliably know that I have solved this issue.

Does this mean I should get back to setting up the smart-account-module?

jimmychu0807 avatar Oct 16 '24 11:10 jimmychu0807

Yea, you would need to run the bundler and try to do a userOp with semaphore proof as the verifier. Basically, get the tests in https://github.com/saleel/semaphore-wallet pass with the original Semaphore contract (currently it uses the custom ones I have over there)

saleel avatar Oct 17 '24 13:10 saleel

@cedoor replying to your https://github.com/semaphore-protocol/semaphore/issues/345#issuecomment-2429083407

To fix this issue, I want to be able to reproduce the error by running the test cases in https://github.com/saleel/semaphore-wallet . But the bundler component needed in the e2e test has updated since then and I have a hard time making all components connect together to trigger the error.

The opcode restriction is outlined at [OP-012]: https://eips.ethereum.org/EIPS/eip-7562#opcode-rules

An alternative is just that I make line from:

success := staticcall(sub(gas(), 2000), 7, mIn, 96, mIn, 64)

to

success := staticcall(gas(), 7, mIn, 96, mIn, 64)

will that cause any security implication?

jimmychu0807 avatar Oct 22 '24 12:10 jimmychu0807

@jimmychu0807 @cedoor

Let me add some comments and explain.

Here, sub(gas(), 2000) ensures that 2000 gas units are reserved for the remaining execution after the staticcall. This is a common practice to prevent the called function from consuming all available gas, which could potentially leave insufficient gas for subsequent operations.

In the proposed change, success := staticcall(gas(), 7, mIn, 96, mIn, 64), the staticcall will have access to all remaining gas, potentially consuming the entire gas stipend allocated for the transaction.

Typically, reserving gas helps prevent/reduce reentrancy attacks by ensuring that certain operations can always complete. However, in this specific context:

  • The staticcall targets precompiled contracts (addresses 6 and 7), which are trusted and do not contain reentrant logic (as far I can see).
  • And, static calls do not allow state modifications, further reducing the risk of reentrancy.
  • The functions following the staticcall in this context are minimal and primarily involve memory operations and checks. In most cases, they do not require substantial gas.

I suggest to proceed with the change. Replacing sub(gas(), 2000) with gas() is unlikely to introduce security vulnerabilities in the current context.

However, if future modifications introduce more gas-intensive operations after the staticcall, not reserving gas could lead to unexpected failures.

Few more additional points:

  • monitor the gas consumption of the verifyProof function after making this change
  • test the modified function with various inputs and gas limits to ensure it behaves correctly under different conditions

thogiti avatar Oct 23 '24 11:10 thogiti

@thogiti thank you for the explanation! With your comment, I take a deeper in the code. And I agree with what you said.

  • L#575: this is calling ethereum precompile at address 0x7, elliptic curve scalar multiplication. Yes, it is a pure/view function.

  • L#585: This is elliptic curve point addition.

  • L#665: This is doing a ecc point pairing, and is pretty much at the end of the function.

Let me make the change and check the difference in gas usage. Thank again for the helpful input!

jimmychu0807 avatar Oct 23 '24 14:10 jimmychu0807

@jimmychu0807 I agree that staticcall with gas() call directly is pretty safe, given it does not allow state modifications. that being said, if we really want to preserve 2000 gas, have we tried to use gasleft() instead of gas()? gas() is a low-level EVM opcode, while gasleft() is a higher level Solidity built-in function. Using gasleft() allows the compiler to handle the restriction compliance.

0xDatapunk avatar Oct 23 '24 14:10 0xDatapunk

@0xDatapunk thanks for the feedback. The issue is that we are in the yul assembly. If we call gasleft() we have to interlace yul and regular solidity code together. I am not sure if this worths the hassle.

jimmychu0807 avatar Oct 23 '24 14:10 jimmychu0807

@jimmychu0807 do you minding experimenting calling gasleft() inside of assembly, and let me know what happens? thanks

0xDatapunk avatar Oct 23 '24 15:10 0xDatapunk

gas in yul is the equivalent of gasleft (from here) image

For the 0x06 precompile (ecAdd) the following note is included:

The gas cost is fixed at 150. However, if the input does not allow to compute a valid result, all the gas sent is consumed.

Similar logic exists for 0x07 and 0x08.

Given this restriction preserving some gas seems necessary if we want to handle failure cases. That is, if when verifying an invalid proof we want to return false instead of reverting. If this change is made we should have tests that hit the failure case of each precompile to observe/document the behavior.

Another approach could be hardcoding the gas sent to the precompile. The cost is dynamic based on the bit structure of the input so we could hardcode it to the upper bound of gas needed.

chancehudson avatar Oct 23 '24 16:10 chancehudson

Thanks for all the feedback!

PR: https://github.com/semaphore-protocol/semaphore/pull/883

@jimmychu0807 Please, check the previous comment https://github.com/semaphore-protocol/semaphore/issues/871#issuecomment-2432856752. The PR should include some other tests to be sure the function doesn't fail.

Another approach could be hardcoding the gas sent to the precompile. The cost is dynamic based on the bit structure of the input so we could hardcode it to the upper bound of gas needed.

Cool, this should also work!

cedoor avatar Oct 23 '24 17:10 cedoor

@0xDatapunk

@jimmychu0807 do you minding experimenting calling gasleft() inside of assembly, and let me know what happens? thanks

A compiler error:

DeclarationError: Function "gasleft" not found.
  --> contracts/base/SemaphoreVerifier.sol:62:43:
   |
62 |                 success := staticcall(sub(gasleft(), 2000), 7, mIn, 96, mIn, 64)
   |                                           ^^^^^^^

jimmychu0807 avatar Oct 24 '24 02:10 jimmychu0807

oops, then have to do something like:

   uint256 gasToUse = gasleft() - 2000;
   assembly {
       let gasForCall := gasToUse
       success := staticcall(gasForCall, 7, mIn, 96, mIn, 64)
   }

which u don't like.

0xDatapunk avatar Oct 24 '24 03:10 0xDatapunk

the opcode rules disallow using gas like this (gasleft should compile to a gas call)

chancehudson avatar Oct 24 '24 03:10 chancehudson

I take another (deeper) look at the SemaphoreVerifier.sol.

First on L#89 calling g1_mulAccC(), it is passing:

  1. _pVk: 64 bytes containing the 4th and 5th entries of vkPoints.
  2. the 6th (192/32) entry of vkPoints
  3. the 7th (224/32) entry of vkPoints
  4. the 0th entry (0) of pubSignals.

Inside g1_mulAccC() function, L#62 will always NOT fail. This is because the first two parameters, x and y, are valid constants included the contracts, vkPoints, with s (the pubSignal input from user) can be any 32-byte value.

Then pR used in L#69 and L#70, which is the _pVk above, are coming from vkPoints as well.

This logic applies to three others g1_mulAccC() calls in L#91, L#98, L#105.

Hmm... what I want to say is it may not be that straightforward to make L#62 and L#72 to revert.

Let me ponder more on this.

jimmychu0807 avatar Oct 24 '24 10:10 jimmychu0807

Inside g1_mulAccC() function, L#62 will always NOT fail. This is because the first two parameters, x and y, are valid constants included the contracts, vkPoints, with s (the pubSignal input from user) can be any 32-byte value.

I looked at this a bit and it seems like you're right. If the codepath cannot revert then i think it's okay to not test the revert. e.g. we just remove the sub call and add a comment saying a revert is unreachable.

On a somewhat related note this looks like a good opportunity for optimization. We could remove a bunch of mload and add calls. This should happen in a separate issue though.

chancehudson avatar Oct 26 '24 22:10 chancehudson

please refer to the updated #883.

jimmychu0807 avatar Oct 27 '24 06:10 jimmychu0807