nearcore
Yield Execution (NEP 516 / NEP 519)
NEPs: near/NEPs#516 near/NEPs#519
The following branches contain a basic prototype of yield execution supporting the chain signatures use case:
- nearcore: https://github.com/saketh-are/nearcore/tree/yield_resume_v0
- near-sdk-rs: https://github.com/saketh-are/near-sdk-rs/tree/mpc_contract_v0
To test out the chain signatures contract:
1. Build `neard` and run localnet.
2. Build `mpc_contract` from `near-sdk-rs/examples`.
3. Create a new account `mpc.node0` on localnet: `env NEAR_ENV=localnet near create-account "mpc.node0" --keyPath ~/.near/localnet/node0/validator_key.json --masterAccount node0`.
4. Additionally create accounts `requester.node0` and `signer.node0`.
5. Deploy the contract code: `env NEAR_ENV=localnet near deploy "mpc.node0" <path/to/mpc_contract.wasm>`
6. Submit a signature request: `env NEAR_ENV=localnet near call mpc.node0 sign '{"payload" : "foo"}' --accountId requester.node0`. Observe that the request will hang.
7. From a separate terminal, use `env NEAR_ENV=localnet near call mpc.node0 log_pending_requests --accountId signer.node0` to see the data id for the pending request. In a real use case, the signer node will monitor the contract via indexers to get this information.
8. Submit a "signature" via `env NEAR_ENV=localnet near call mpc.node0 sign_respond '{"data_id":"<data id here>","signature": "sig_of_foo"}' --accountId signer.node0`.
9. In the original terminal, observe that the signature request returns with "sig_of_foo_post".
Note that steps 6-8 are a bit time-sensitive at the moment. If the call to `sign` in step 6 doesn't receive a response from step 8 within roughly a minute, you'll eventually see the message `Retrying transaction due to expired block hash`.
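For context, here is a minimal sketch of what the contract side of this flow can look like. This is not the actual `mpc_contract` code from the branch above; it assumes yield/resume bindings along the lines of `env::promise_yield_create` / `env::promise_yield_resume` as they appear in current near-sdk-rs (the prototype branch may use different names and signatures), and the `YIELD_REGISTER` constant, `pending` map, and `sign_on_finish` callback are purely illustrative.

```rust
use near_sdk::json_types::Base58CryptoHash;
use near_sdk::store::LookupMap;
use near_sdk::{env, near, CryptoHash, Gas, GasWeight, PromiseError};

// Register used to receive the data id produced by the yield host function (illustrative).
const YIELD_REGISTER: u64 = 0;

#[near(contract_state)]
pub struct MpcContract {
    // data id -> payload awaiting a "signature" from the signer node.
    pending: LookupMap<CryptoHash, String>,
}

impl Default for MpcContract {
    fn default() -> Self {
        Self { pending: LookupMap::new(b"p".to_vec()) }
    }
}

#[near]
impl MpcContract {
    /// Called by the requester; the returned promise stays pending until
    /// `sign_respond` resumes it (or the protocol-level timeout fires).
    pub fn sign(&mut self, payload: String) {
        let promise = env::promise_yield_create(
            "sign_on_finish",   // callback to run once the yield resolves
            &[],                // no extra callback arguments
            Gas::from_tgas(10), // static gas reserved for the callback
            GasWeight(0),
            YIELD_REGISTER,
        );
        // The protocol writes the data id of the yielded promise into the register.
        let data_id: CryptoHash = env::read_register(YIELD_REGISTER)
            .expect("yield register is empty")
            .try_into()
            .expect("data id must be 32 bytes");
        self.pending.insert(data_id, payload);
        env::promise_return(promise);
    }

    /// Read by the signer node (in practice via an indexer) to find requests to serve.
    /// A real contract would expose the pending entries; elided to keep the sketch short.
    pub fn log_pending_requests(&self) { /* iterate and log `self.pending` */ }

    /// Called by the signer node to supply the "signature" and resume the yield.
    pub fn sign_respond(&mut self, data_id: Base58CryptoHash, signature: String) {
        let data_id: CryptoHash = data_id.into();
        self.pending.remove(&data_id);
        // JSON-encode the payload so the callback below can deserialize it.
        let payload = near_sdk::serde_json::to_vec(&signature).unwrap();
        env::promise_yield_resume(&data_id, &payload);
    }

    /// Callback invoked when the yielded promise resolves (or times out).
    #[private]
    pub fn sign_on_finish(
        &mut self,
        #[callback_result] signature: Result<String, PromiseError>,
    ) -> String {
        match signature {
            Ok(sig) => format!("{sig}_post"),
            Err(_) => "request timed out".to_string(),
        }
    }
}
```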
Remaining work includes:
### Tasks
- [x] Implement timeouts
- [x] Ensure that the caller for `promise_data_yield` is the only one that can call `promise_submit_data`
- [x] Ensure that `promise_submit_data` cannot be used with data ids not created by `promise_data_yield`
- [x] Ensure the case where `yield_resume` is called within the same transaction as `yield_create` works correctly
- [x] Rework the implementation on the nearcore side to avoid creating a new Action type
- [x] `gas` argument for `promise_yield_create` should be accompanied with the `gas_weight` argument
- [x] Simplify trie state to just yield queue plus postponed action receipts
- [ ] Ensure gas costs are charged properly, including a new cost parameter for postponing a receipt
- [ ] Implement resharding logic for yielded promise queue
- [x] Determine an acceptable value for the timeout length
- [ ] Tests: feature gating works as expected
- [x] Tests: functionality works as expected (i.e. see various "ensure" steps above – all of these should be tested)
- [ ] Check if profile entries related to yield/resume need to be excluded from the profiles while this functionality ain't stable
- [ ] Charge for creation of a data receipt as part of `promise_yield_create`, seeing as one will be created by the protocol if the yield ends up timing out.
- [x] The error should be the same for all different failure modes of `yield_submit_data_receipt`
- [x] Look into "Retrying transaction due to expired block hash"
- [ ] Look into integration with the higher-level Promises API in near-sdk-rs
Can we link PRs for the listed tasks to this tracking issue?
I have a couple of draft PRs I'm continuing to iterate on:
- https://github.com/near/nearcore/pull/10415 for the nearcore changes
- https://github.com/near/near-sdk-rs/pull/1133 for the rust sdk changes and test contract
I don't anticipate having separate PRs for each subtask since it won't make sense to merge this until we have all the details right.
Here are some thoughts I had today while thinking about estimation and costs for this feature.
- The hard part is really just paying for the storage. The compute costs for handling these new operations seem straightforward and can be largely replicated from the code for the already implemented estimations.
- `promise_await_data` seems quite straightforward in isolation too;
  - Except when used in combination with `promise_and`: you can `join` a bunch of `promise_data_await` dependencies together, and some of the backing structures will have to live for as long as the longest `promise_data_await`! If implemented naively, this combination can result in a surprising amount of gas charged for the amount of work done.
- Data is written to RocksDB as a result of `promise_submit_data`, but that will most likely happen deep in the outskirts of the transaction runtime where gas tracking is no longer conducted, so it is necessary to account for this at the same time the base action cost is charged.
  - `cost(promise_submit_data) = action_base + storage_base + storage_per_byte`?
- This data is later read out multiple times before the continuation for `promise_then` is invoked.
  - As far as I understand, the value can end up being read a number of times proportional to how long the pending continuation action's inputs remain unresolved;
  - It will still be read multiple times (though possibly fewer) even if the dependencies are resolved immediately;
  - More importantly, there seems to be no way to know ahead of time just how many reads of this data can occur. It doesn't help even if we could guarantee that receipt inputs are read out once every block production cycle, because users can arbitrarily delay the resolution of the promise in the proposed design.
  - It sounds like it'd be really hard to estimate this accurately, and we should replace these read-outs with a presence check (which is effectively what they are used for) in the first place; that would make this problem much less severe and cost estimation easier to reason about (see the sketch after this list).
    - I'll try hacking on it, maybe I'll learn something new along the way…
- Regardless, `promise_submit_data` actually sounds like the most appropriate place to charge gas for storage of the data being submitted, as this is the first point in the logic flow at which both the number of blocks to store the data for and the amount of data being stored are known.
  - The costs here should be appreciable, so that there isn't an economic incentive to use this as a mechanism for data storage.
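To make the presence-check idea above concrete, here is a minimal sketch. It assumes a simplified key/value store trait rather than the real nearcore trie API, and the function names are made up for illustration:

```rust
/// Simplified stand-in for the state access layer; the real nearcore trie API differs.
trait KeyValueStore {
    fn contains_key(&self, key: &[u8]) -> bool;
    fn get(&self, key: &[u8]) -> Option<Vec<u8>>;
}

/// Naive approach: read the submitted value out every time we check whether the
/// postponed receipt can execute. Cost grows with the value size and with how
/// long the promise stays unresolved.
fn is_input_ready_naive(store: &dyn KeyValueStore, data_id: &[u8]) -> bool {
    store.get(data_id).is_some()
}

/// Presence check: only ask whether the key exists. The value itself is read once,
/// when the continuation actually executes.
fn is_input_ready(store: &dyn KeyValueStore, data_id: &[u8]) -> bool {
    store.contains_key(data_id)
}
```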
I have addressed the concern of the data being read out multiple times throughout the life of an unresolved promise in the PR referenced just above. Now we are only going to do a simple check for key existence, which simplifies the cost model significantly. In particular, the model no longer needs to account for the period of time between when `promise_submit_data` is called and when the future gets resolved, at least in terms of compute cost. Furthermore, I hear that we're now looking at making the timeout a constant system-wide parameter, rather than a user-controllable one, which is probably a slight simplification as well.
In my mind a correct cost model in the context of these changes looks like this (a rough sketch in code follows the list):
- `cost(promise_await_data) = action_base + max_timeout * is_ready_check_cost` – the `action_base` covers the compute resources to set up the action receipt, while the `max_timeout * ready_check_cost` part covers the compute resources for checking whether the promise is ready to go, in the worst case once every block. `ready_check_cost` should be roughly equivalent to a single check of key existence in the database.
  - Compared to the usual `promise_` operations, the `ready_check_cost` is unique to this function, because it hands resolution timing control to the end users.
  - In practice we won't be reading out the receipt data input state every block, but the worst case, I believe, allows such a scenario to happen. Our current costs for storage ops are `"storage_read_key_byte": 30952533` and `"storage_read_value_byte": 5611005`. These are low enough that I think we could afford not to worry about refunds for unused delay in case the future gets resolved early.
- `cost(promise_submit_data) = action_base + cost(storage_write(data)) + cost(storage_read(data)) + cost(storage_for_max_timeout(data))`
  - This model assumes we can only `submit_data` successfully once.
  - We charge the fees for storing the data for up to the maximum timeout here, in case the `await_data` this is paired with is part of a `promise_and` and cannot be immediately resolved.
  - This is a departure from the usual staking-for-storage model; charging gas to store data is an entirely new concept for us... Can we somehow have the contract stake NEAR, just like it would need to for regular storage operations?
    - I wrote down my thoughts on this here onwards
- Upon promise resolution: refund the difference between `cost(storage_for_max_timeout(data))` and `cost(storage_for_n_blocks(data, actual_blocks_of_delay))`.
  - Do we refund the difference for unused storage time once the `await_data` continuation is executed at all? Whom do we refund -- the contract? The person who originally invoked the function call that then executed `promise_submit_data`? How do we ensure we use the same gas:NEAR price ratio as was used when creating `submit_data`?
  - If we do not refund the storage fees, there's no incentive to resolve all pending receipts ASAP, but it simplifies the implementation of the validator significantly.
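As a rough illustration of the model above, here is a sketch in code. The parameter names are made up for the sketch and are not actual nearcore runtime config fields:

```rust
/// Hypothetical parameters; not actual nearcore runtime config fields.
struct YieldCosts {
    action_base: u64,                 // gas to set up the action receipt
    ready_check_cost: u64,            // roughly one key-existence check in the database
    storage_write_per_byte: u64,      // gas per byte written
    storage_read_per_byte: u64,       // gas per byte read
    storage_rent_per_byte_block: u64, // gas per byte per block of postponement
    max_timeout_blocks: u64,          // system-wide yield timeout, in blocks
}

impl YieldCosts {
    /// cost(promise_await_data): base receipt cost plus a worst-case readiness
    /// check in every block until the timeout elapses.
    fn promise_await_data(&self) -> u64 {
        self.action_base + self.max_timeout_blocks * self.ready_check_cost
    }

    /// cost(promise_submit_data): write and read of the payload, plus storage for
    /// up to the maximum timeout, since the matching await may sit inside a
    /// `promise_and` and resolve late.
    fn promise_submit_data(&self, data_len: u64) -> u64 {
        self.action_base
            + data_len * self.storage_write_per_byte
            + data_len * self.storage_read_per_byte
            + data_len * self.storage_rent_per_byte_block * self.max_timeout_blocks
    }

    /// Refund issued upon promise resolution for the unused portion of the
    /// prepaid storage window (if we decide to refund at all).
    fn resolution_refund(&self, data_len: u64, actual_blocks_of_delay: u64) -> u64 {
        let unused = self.max_timeout_blocks.saturating_sub(actual_blocks_of_delay);
        data_len * self.storage_rent_per_byte_block * unused
    }
}
```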
The `gas` argument for `promise_yield_create` should be accompanied by the `gas_weight` argument. I added this to the task list.
Status update @walnut-the-cat: Fixed-length timeouts are implemented now. Work continues on gas costs and bounding congestion (mainly thanks to @nagisa), as well as on the misc. smaller implementation details documented on this tracking issue.
Very excited about this idea! Thank you, contributors