NEPs icon indicating copy to clipboard operation
NEPs copied to clipboard

Support for yielded execution

Open akhi3030 opened this issue 1 year ago • 16 comments

akhi3030 avatar Nov 17 '23 12:11 akhi3030

Your Render PR Server URL is https://nomicon-pr-519.onrender.com.

Follow its progress at https://dashboard.render.com/static/srv-clbm2n54lnec73ashvhg.

render[bot] avatar Nov 17 '23 12:11 render[bot]

At the protocol working group meeting yesterday, different members expressed concerns on these new runtime APIs. I'll attempt the summarize them below:

  • There is nothing that stops someone from emulating the API in the sdk by saving the promise in the state and triggering it again once it is ready to be resumed. The call stack is discontiguous but that could potentially be fixed on the tooling level and it seems that Aurora already does this.
  • Whether or not the delayed promise has an expiration time, the introduction of this API could lead to a new type of failure that does not previously exist where the contract execution fails with no way to clean up the state. It could be fine for the chain signature use case, but if some (potentially malicious) developer uses this in a defi contract for oracles (for example), then it is possible that the contract call stalls/fails in the middle, which results in users losing access to their fund.
  • The API does not specify who can call "resume". This means that a malicious attacker can just call resume with invalid data to make a transaction fail. While the attacker does need to spend gas themselves, it is quite cheap. This allows for a relatively easy DoS attack, especially given that the gas that a user needs to spend may be larger than what an attacker needs to spend to cancel the transaction. In addition, failure in a chain of calls A -> B -> C may also lead to inconsistent state in one of the smart contract involved.

cc @mm-near @birchmd @mfornet

bowenwang1996 avatar Jan 06 '24 17:01 bowenwang1996

* There is nothing that stops someone from emulating the API in the sdk by saving the promise in the state and triggering it again once it is ready to be resumed. The call stack is discontiguous but that could potentially be fixed on the tooling level and it seems that Aurora already does this.

This main motivation for this work is the MPC project. If we can satisfy the requirements without having to change the protocol, that much better.

Can you provide some links to how aurora does this? Maybe then @DavidM-D and @itegulov can review them.

* Whether or not the delayed promise has an expiration time, the introduction of this API could lead to a new type of failure that does not previously exist where the contract execution fails with no way to clean up the state. It could be fine for the chain signature use case, but if some (potentially malicious) developer uses this in a defi contract for oracles (for example), then it is possible that the contract call stalls/fails in the middle, which results in users losing access to their fund.

I am not sure I completely understand this point. Especially about not having a way to clean up the state. If things expire, the protocol will call the contract. So is the contract may not handle that case properly? Isn't that similar to how the current Promise API works?

* The API does not specify who can call "resume". This means that a malicious attacker can just call resume with invalid data to make a transaction fail. While the attacker does need to spend gas themselves, it is quite cheap. This allows for a relatively easy DoS attack, especially given that the gas that a user needs to spend may be larger than what an attacker needs to spend to cancel the transaction. 

If contract A called yield_create, only contract A can call resume on that promise. No other contract can.

In addition, failure in a chain of calls A -> B -> C may also lead to inconsistent state in one of the smart contract involved.

Can you please expand on this? If A is calling B, then B can fail for any number of reasons today, (it can trap, run out of gas, have invalid state). Is the problem that we are introducing a new failure mode now that B has to handle?

akhi3030 avatar Jan 08 '24 13:01 akhi3030

I am not sure I completely understand this point. Especially about not having a way to clean up the state. If things expire, the protocol will call the contract. So is the contract may not handle that case properly? Isn't that similar to how the current Promise API works?

I misread the proposal and therefore thought incorrectly that the execution would just fail at that point.

If contract A called yield_create, only contract A can call resume on that promise. No other contract can.

Yes, I guess the concern here is that a contract developer can shoot themselves in the foot if they are not careful with how they valid the computation result that is passed to yield_resume because while only the contract itself can resume the computation. Anyone can call the method (signature_available in this case) that provides the computation result. If there is no proper validation of the result, then a malicious attacker can easily submit something that causes the original transaction to fail. I don't think this is a concern for MPC signature per se since that contract will be carefully written and audited, but still worth noting for the general use case.

bowenwang1996 avatar Jan 10 '24 04:01 bowenwang1996

Yes, I guess the concern here is that a contract developer can shoot themselves in the foot if they are not careful with how they valid the computation result that is passed to yield_resume because while only the contract itself can resume the computation. Anyone can call the method (signature_available in this case) that provides the computation result. If there is no proper validation of the result, then a malicious attacker can easily submit something that causes the original transaction to fail. I don't think this is a concern for MPC signature per se since that contract will be carefully written and audited, but still worth noting for the general use case.

Yes, this is always possible. Luckily, we have some nice tools (https://github.com/aurora-is-near/near-plugins/blob/master/near-plugins/src/access_controllable.rs) available that could be used.

akhi3030 avatar Jan 10 '24 14:01 akhi3030

@near/nep-moderators: I believe this PR should be ready for review.

akhi3030 avatar Feb 19 '24 17:02 akhi3030

Thank you @akhi3030 and @saketh-are for submitting this NEP.

As a moderator, I reviewed the NEP meets the proposed template guidelines, so I am moving this NEP to the REVIEW stage.

I would like to ask the @near/wg-protocol working group members to assign 2 Technical Reviewers to complete a technical review (see expectations below).

Just for clarity, Technical Reviewers play a crucial role in scaling NEAR ecosystem as they provide their in-depth expertise in the niche topic while work group members can stay on guard of the NEAR ecosystem. The discussions may get too deep and it would be inefficient for each WG member to dive into every single comment, so NEAR Developer Governance designed this process that includes subject matter experts helping us to scale by writing a summary with the raised concerns and how they were addressed.

Technical Review Guidelines
  • First, review the proposal within one week. If you have any suggestions that could be fixed, leave them as comments to the author. It may take a couple of iterations to resolve any open comments.

  • Second, once all the suggestions are addressed, produce a Technical Summary, which helps the working group members make a weighted decision faster. Without the summary, the working group will have to read the whole discussion and potentially miss some details.

Technical Summary guidelines:

  • A recommendation for the working group if the NEP is ready for voting (it could be approving or rejecting recommendation). Please note that this is the reviewer's personal recommendation.

  • A summary of benefits that surfaced in previous discussions. This should include a concise list of all the benefits that others raised, not just the ones that the reviewer personally agrees with.

  • A summary of concerns or blockers, along with their current status and resolution. Again, this should reflect the collective view of all commenters, not just the reviewer's perspective.

Here is a nice example and a template for your convenience:


### Recommendation

Add recommendation

### Benefits

* Benefit

* Benefit

### Concerns

| # | Concern | Resolution | Status |

| - | - | - | - |

| 1 | Concern | Resolution | Status |

| 2 | Concern | Resolution | Status |

Please tag the @near/nep-moderators once you are done, so we can move this NEP to the voting stage. Thanks again.

walnut-the-cat avatar Feb 21 '24 15:02 walnut-the-cat

As a working group member, I'd like to nominate @birchmd and @nagisa as subject matter experts to review the NEP.

bowenwang1996 avatar Feb 26 '24 19:02 bowenwang1996

One thing I have a hard time wrapping my head around it the necessity for the "Reference Implementation" section

I agree, I also found it a bit difficult to figure out how much details to present here. Inspired by this NEP I think it is fine to just provide a link to the PR. I will make the change to this PR now. And let the SMEs decide if they are happy with the change or not.

akhi3030 avatar Feb 28 '24 14:02 akhi3030

@birchmd, @nagisa: thank you very much for the review. I believe I have addressed all the feedback that you provided. There is an ongoing discussion on the edge case around when a timeout has happened but the callback has not executed and the contract calls resume. Once we resolve the discussion, I will update the spec. But this should be a minor change. So I believe that this should be ready for you to look once again.

akhi3030 avatar Feb 28 '24 18:02 akhi3030

NEP Status (Updated by NEP Moderators)

Status: Approved

SME reviews:

Protocol Work Group voting indications (❔ | 👍 | 👎 ):

  • 👍 @bowenwang1996 https://github.com/near/NEPs/pull/519#issuecomment-1982019453
  • 👍 @birchmd https://github.com/near/NEPs/pull/519#pullrequestreview-1909863557
  • 👍 @mfornet https://github.com/near/NEPs/pull/519#issuecomment-1982224099
  • 👍 @mm-near https://github.com/near/NEPs/pull/519#issuecomment-1983987776

victorchimakanu avatar Feb 29 '24 02:02 victorchimakanu

As the moderator, I would like to thank the author @akhi3030 for submitting this NEP extension, and the work group members for your review. Based on the voting indications above, it seems like this proposal is close to a decision. We are therefore scheduling the 8th Protocol Work group meeting next week.

Anyone can discuss the technical details by adding your comments to this discussion. And join the call to learn about the final decision and how to get more involved.

Meeting Info:

📅 Date: Friday, March 8th, 4 PM UTC ✏️ Register here

victorchimakanu avatar Mar 01 '24 10:03 victorchimakanu

As a working group member, I lean towards accepting this proposal. It is a great step towards enabling smart contracts to interact with offchain computation while still preserving the call stack.

bowenwang1996 avatar Mar 06 '24 23:03 bowenwang1996

As a working group member, I lean toward approving this NEP.

This NEP solves a problem that can't be solved today, which is having simultaneously:

  • Delayed responses
  • Guaranteed execution of callback
  • Abstraction from the caller about the delayed response.

Delayed responses are particularly useful to seamlessly use off-chain computation on-chain. This could involve for example oracle queries (eg fetch_tweet(tweet_id)) or privacy protocol execution such as signing some data offchain.

mfornet avatar Mar 07 '24 02:03 mfornet

As a working group member, I lean toward approving this NEP.

  • one small comment: in comments you say:
/// `data_id_len` and `data_it_ptr`: Used to pass the unique resumption token
/// that was returned to the smart contract in the `promise_yield_create()`
/// function.

But promise_yield_create seems to be returning u64

mm-near avatar Mar 07 '24 16:03 mm-near

* one small comment: in comments you say:
/// `data_id_len` and `data_it_ptr`: Used to pass the unique resumption token
/// that was returned to the smart contract in the `promise_yield_create()`
/// function.

But promise_yield_create seems to be returning u64

Yeah, the comment is referring to the resumption token returned in the register. I'll update the comment to address this.

akhi3030 avatar Mar 07 '24 17:03 akhi3030

can we merge this into master?

walnut-the-cat avatar Jun 17 '24 17:06 walnut-the-cat

@near/nep-moderators: please review and approve the PR. The work was just released to the mainnet.

akhi3030 avatar Jun 17 '24 17:06 akhi3030