flow Add FLIP for removing public resource fields

Add FLIP for removing public resource fields

Open dsainati1 opened this issue 2 years ago • 21 comments

This adds a FLIP proposing to remove public fields on resources in Cadence. See more discussion here: https://github.com/onflow/cadence/issues/1322

For contributor use:

[X] Targeted PR against master branch
[X] Linked to Github issue with discussion and accepted design OR link to spec that describes this work.
[ ] Updated relevant documentation
[X] Re-reviewed Files changed in the Github PR explorer
[X] Added appropriate labels

Dec 22 '21 16:12 dsainati1

This pull request is being automatically deployed with Vercel (learn more).
To see the status of your deployment, click below or on the icon next to each commit.

🔍 Inspect: https://vercel.com/onflow/flow-docs/HEqcyMZb2Lt8Fvm1329Sazrcadja
✅ Preview: Failed

[Deployment for 61f5385 failed]

Dec 22 '21 16:12 vercel[bot]

What will happen to existing code that already have public fields in them. Also you say resources in the description so structs are fine right? I tend to return readOnly structs of my resources in contracts that have public fields and I want to be able to continue doing that.

Jan 09 '22 22:01 bjartek

What will happen to existing code that already have public fields in them?

This is a great question, and probably the largest technical challenge in implementing something like this: we would need to have a migration path to allow people to rewrite existing code with public resource fields.

Also you say resources in the description so structs are fine right?

Yes, structs would not be changed as part of this proposal.

Jan 10 '22 15:01 dsainati1

I am against this as you know.

Problem for me is 2 sided, first one is when we remove public fields, we are giving away too much creative choice to the developer. It means there can be a lot of different implementations which can be correct in their own way. And this will lead to very wide range of implementations. I think it is always nice to have a single straight forward way to implement something.

From developer perspective (with example on the FLIP) :

We have few options for backing storage for variable:

Private Variable ( like in FLIP ) priv var X: Int : one does't make any sense, as you cannot add fields to resources.
Dictionary {String:AnyStruct} for example : priv var data: { String: AnyStruct }
Another Contract / Contract private member or dictionary: MyAwesomeContract.fields etc

If we think all of them, first

We also have multiple options for query:

pub fun getX(): Int - doesn't make sense, considering Metadata etc availability.
pub fun getField(name:String): AnyStruct? - super flexible
pub fun getData(): AStructReturnsAccessibleFields - easier access with . to members

Let's say I implemented:

 priv var data: { String: AnyStruct } 
 pub fun getField(name:String): AnyStruct?

So I made this future proof as I can add more fields ( which I cannot add to resources currently. spoiler alert: future footgun here)

or worse:

 priv var id: UInt64
 pub fun getField(name:String): AnyStruct?{
    return MyAwesomeContract.getField(name, self.id)
 }

Oh actually now I can even change/destroy the NFT user has, all I need to do is make a mapping function from id -> real_id.

same for getData(): AStructReturnsAccessibleFields

PS: I think this direction is taking away ownership of resource from User and giving total ownership to the developer.

We already ended up with NFTs with only id field on mainnet. This change is pushing on that direction imo.

Jan 10 '22 17:01 bluesign

With #703 and a potential follow-up proposal for nested composite values, which improves the safety aspect for resources and their nested data, I do not see a strong reason to prevent the declaration of public fields in resources.

Jan 13 '22 22:01 turbolent

@bluesign

we are giving away too much creative choice to the developer. It means there can be a lot of different implementations which can be correct in their own way. And this will lead to very wide range of implementations. I think it is always nice to have a single straight forward way to implement something.

Agreed! I like the declarative aspect of fields that can be read. Both their declaration and access is straight-forward, where as this proposal could lead to developers having to come up with functions that properly allow read access to the nested data, potentially getting it wrong.

We have few options for backing storage for variable Private Variable [...] one doesn't make any sense, as you cannot add fields to resources.

We would of course still allow private variables, and the current limitation for code updates to not allow adding fields should not play a role here.

I think I follow what you are trying to point out with the remaining part of your message: Given that the removal of public fields will require an alternative, this is is going to be "accessor functions", which, instead of fields, have the potential to be updated. This would lead developers to prefer not actually storing data in resources but in a central ledger, and ultimately might be even abused for "rug pulls".

This is a really great argument and I agree with it fully! This seems to be a very strong argument against

Jan 13 '22 22:01 turbolent

The way I see it, there are two primary reasons to consider removing public fields, both of which are touched on above, but allow me to reiterate them for clarity.

Public fields in interfaces constrain implementations. For example, our current Fungible Token (FT) interface requires that the balance is a public field. This actually makes it somewhat intractable to implement something like Ampleforth which internally uses a global "exchange rate" between an internal unit of account and the external amount seen outside the smart contract. This allows the Ampleforth contract to update a single exchange rate and then proportionally adjust the account balances of every user without literally iterating over each account balance.

I don't think it's hard to imagine there might be other scenarios where it's useful to have some code that runs when the balance is queried, but requiring that the balance field is a public data member makes this impossible.
Public fields leak access in surprising ways. We've fixed the worst of these problems with #703, but the problem remains for user-defined types. The simplest example is for a Vault:
```
resource Foo {
   pub let FlowToken.Vault
}
```
Anyone who has a reference to a Foo instance can take any number of tokens out of that Vault. Not obvious, IMHO.

The arguments against removing public fields are also two-fold, it seems:

Public fields in interfaces constrain implementations. Yup! The exact same argument in reverse. As @bluesign points out, constraining implementors has its advantages! In the context of blockchains, there is always the possibility that it's the smart contract developers themselves that are bad actors.

I don't find this argument compelling. Early in the design of Cadence, we made the decision that it is more valuable to give power to honest smart contract authors than to put up road-blocks in front of nefarious smart contract authors. The problem is that preventing bad actors effectively means applying an extraordinary number of constraints. Constraints that would stymie honest ideas that hadn't been foreseen.

Adding this particular design constraint doesn't stop a bad actor, anyway! I don't need to adjust the balance field of your Vault to make your coins worthless. I just need to mint a billion new coins into my own account. (Plus a dozen other attacks I'm sure we could think of, as the smart contract author.)

In the end, ensuring that smart contracts are publicly available and easily reviewed for correctness is the best protection we have against nefarious smart contract authors.
It's convenient. I completely agree with this, but Cadence is not like other languages. The amount of Cadence code written for any given project is likely to be something between 1/10th to 1/100th of the overall code for the project, or even less. Convenience must take a back seat to clarity and safety.

Honestly, if we had a simple solution to the "leaky access" problem, then there wouldn't be any tension between convenience and clarity. But the only solution I can see to "leaky access" is implementing mutability tracking, like Rust or C++, and I my own experience with C++ taught me that proper "const correctness" is anything but convenient!

If we require that all Resource data members are private, we force smart contract developers to do the following:

Explicitly decide (in a way that is obvious to auditors and other reviewers) which data fields are readable, and which are writable.
Be very intentional about what interfaces they expose as readable or writable (using attenuated references, i.e. &{Balance, Reciever} instead of concrete types Vault).

Don't those sound like things we want everyone writing smart contracts to think about? 😉

Jan 25 '22 02:01 dete

I think we can agree that discussion here is not on technical level, more it is decision on philosophy of the Flow. I mentioned this before in Discord that there is a lack of basic ruleset ( I mentioned that as half jokingly "Flow Constitution" ) of ideals.

Let's get out of technical restrictions etc for a moment. What is Cadence? Let's read from docs.

The Cadence Programming Language is a new high-level programming language intended for smart contract development.

Ok so far, we can summarize it as : Cadence -> Smart Contract Language

It is not for me Smart EULA Language.
Contract if we draw parallels from real world is by definition: Set of rules: that are in the benefit of both parties, which can be changed only when both sides agree. So Flow ( and so far Cadence ) should be something like a notary acting between User and Developer.

As we have upgradeable contracts, it is very tricky to balance power between User and Developer.

The language's goals are, in order of importance: Safety and security: Provide a strong static type system, design by contract (preconditions and postconditions), and resources (inspired by linear types). Auditability: Focus on readability: Make it easy to verify what the code is doing, and make intentions explicit, at a small cost of verbosity. Simplicity: Focus on developer productivity and usability: Make it easy to write code, provide good tooling.

I think those goals are amazing, even though a lot of cadence developers lack ability to "design by contract", resources are granting some freedom to Users.

Early in the design of Cadence, we made the decision that it is more valuable to give power to honest smart contract authors than to put up road-blocks in front of nefarious smart contract authors.

This is in my opinion a bit "Focus on developer productivity and usability" vs "Safety and security".

(Plus a dozen other attacks I'm sure we could think of, as the smart contract author.)

Let's think in a way that: Contract owner can do hostile action, no point to protect against that. ( Though I don't agree )

Now we lost one of the selling points of Cadence: Composability. How can you use composability when you are in untrusted environment?

I am all in for freedom, but by definition, contracts and interfaces are to define restrictions. Maybe it is better to not be able to implement everything in Flow ( also in Cadence ), but implement things we can in the best way.

For me not being able to implement Ampleforth is small price to pay if it means we will be able to provide Users security about their holdings. ( or security against rug pulls )

If we require that all Resource data members are private, we force smart contract developers to do the following:

Explicitly decide (in a way that is obvious to auditors and other reviewers) which data fields are readable, and which are writable.

Be very intentional about what interfaces they expose as readable or writable (using attenuated references, i.e. &{Balance, Reciever} instead of concrete types Vault).

I agree those are great things to have. But I just don't agree the way it should be with black box resources.

Removing constant fields without any replacement is very dangerous imo.

Not the best example but, when I see some resource:

```A.0b2a3299cc857e29.TopShot.NFT(uuid: 32849189, id: 10442766, data: A.0b2a3299cc857e29.TopShot.MomentData(setID: 26, playID: 1066, serialNumber: 14866))````

I know it is minted by: '0b2a3299cc857e29' which is official TopShot Account. With id: 10442766, and setID: 26, playID: 1066, serialNumber: 14866 ( Actually it would be much better, if it included MomentDescription )

versus:

A.0b2a3299cc857e29.TopShot.NFT(uuid: 32849189)

First one is set to stone: I own serial 14866 of the moment. You can if you want later mint 1000000 of this moment, I still own the 14866. ( This reminds me on book market, market can even value new 1 million prints as worthless while giving much value on the first edition print (first 15000 moments for example ) )

Don't we want Users to have some power of negotiation in the table against Developers ?
Is it not better to push Developers to security review and/or certification ? So they can prove their Users that they are trustworthy ?
Wouldn't it be better to educate 'Usersto invest inSecurity Reviewed by 3rd partytrustworthyContracts/Projects`?

Jan 25 '22 12:01 bluesign

To me, it comes down to the following: Can someone trust an FT or NFT asset without knowing that it has been reviewed by a capable Cadence developer (possibly oneself)?

To me the answer is, every single time, "No. Unequivocally, no."

Should it be the case that a "good" NFT contract uses a constant variable for the ID? Yes! Absolutely. I agree that's what should be done, and I think anyone reviewing an NFT implementation should be very skeptical if they see that's not the case! But even if we enforce that at the language level, and with the definition of the NFT spec, that doesn't mean you can trust every NFT implementation. I don't believe we should lock down the metadata representation for NFTs, for example, because I think there are some really useful and interesting ways to have dynamic metadata on NFTs. But that means that buyers of NFTs need to know the creator isn't going to use that capability to cause harm.

And if the smart contract author can change all of the other data associated with the NFT, I'm not sure how valuable it is to make sure they can't mess with the ID.

The reason I used the Ampleforth example isn't because I think that Ampleforth is a critical piece of infrastructure that Flow needs to have, but because it shows the dangers trying to presuppose all the "right ways" to think about a problem. They changed the way that people thought about how balanceOf could work, and lots of people found that new approach valuable. It would have been impossible if ERC-20 worked by directly exposing the balance lookup table.

I consider the bar for "blocking behaviour" very high. And I don't think making the balance and id fields of the FT and NFT specs (respectively) buys us very much security. The security of the system (as I know @bluesign knows very well!) is the weakest link, and these fields are very far from being the weakest link in the world of token rugpulls.

Here's an interesting thought experiment. Let's say we changed the FT and NFT interfaces to use methods for balance and id respectively. We could do that without changing the language. What other arguments would there be against removing public data fields on resources?

Feb 14 '22 19:02 dete

@dsainati1 What's the status of this PR? Should it be archived or is this still in play?

May 19 '22 00:05 pgebheim

I don't think we have reached consensus here, so I don't feel comfortable archiving it.

May 19 '22 14:05 dsainati1

Over the next bit we will be working on moving FLIPs to their own repo. We can merge this as draft and continue discussion in the forums if you'd like to keep it open past the end of next week.

How does that sound?

May 19 '22 15:05 pgebheim

Yea that sounds good to me

May 19 '22 15:05 dsainati1

Thank you for your feedback on the rug-pull concern @dete, it seems like it is not as big a concern as we assumed.

What other arguments would there be against removing public data fields on resources?

The proposal lists a couple drawbacks:

Increase in boilerplate: field definition + getter (+ potential setter)
Breakage of a large portion of existing code

Another drawback not yet mentioned (AFAICS) is that this adds a special rule to the language, which developers have to learn and might stumble upon ("huh? I can declare a public field in a struct or contract, but in a resource I cannot? Why?"). This also needs a special case in the implementation. It is small, but still, ideally we want to avoid adding many special cases to the language.

With #703 accepted, implemented, and soon to be released as part of the Secure Cadence release in the next spork, the main concern, allowing accidental mutation of nested values, is gone.

Maybe to reverse the question: What arguments are left to remove public fields? Are they worth the drawbacks?

If standards and contracts opt to use public getter functions instead of exposing public fields, they can. But do we want to unnecessarily burden developers by requiring them to use public getter functions, even when a simple field would be sufficient?

May 19 '22 23:05 turbolent

Will this also effect synthetic fields in the future? Or simply if we implemented synthetic fields already, doesn't it make this FLIP unnecessary ?

May 20 '22 05:05 bluesign

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name	Status	Preview	Updated
flow-docs	❌ Failed (Inspect)		May 20, 2022 at 2:04PM (UTC)

May 20 '22 14:05 vercel[bot]

Synthetic fields were proposed in the original language spec, but have so far not been implemented. They are somewhat related: I guess if we were to implement synthetic fields and this proposal gets accepted, then we should also require synthetic fields in resourced to be non-public.

May 20 '22 20:05 turbolent

Yeah it would be little weird to having synthetic field and then accessing not public I guess. ( in the end it is getter/setter function combo )

Though if we allow public synthetic fields, then one step ahead is some decorator to auto convert private fields to public synthetic ones. (Virtually)

May 20 '22 21:05 bluesign

@dsainati1 Could you please provide a summary of the current discussion, i.e. what the current sentiment is, and if and what any outstanding questions are? Thanks!

Jul 22 '22 20:07 turbolent

This FLIP seems to have a majority opinion against implementing it, but there are compelling arguments on both sides.

@bluesign and @turbolent have outlined concerns about how removing public resource fields will result in too many differing ways to implement the same kind of resource logic, making using them more difficult and potentially adding a safety footgun wherein bad actors can implement malicious accessor functions, or change the behavior of previously working accessors.

@dete has pointed out, however, that the by constraining developers into using uniform implementations for their resources, design paradigms that make heavy use of public fields also necessarily decrease the power that a good actor has when writing resource implementations. Specifically he cites the example wherein Flow's FT contract requires its balance to be a public field, which prevents implementors from having something like an exchange rate in their token that automatically adjusts its value according to an external factor.

He also cites mutability concerns, wherein public fields, especially those that contain user-defined types (but also any field through a reference), can be mutated unexpectedly, and that removing public resource fields would defend against this footgun.

This suggests to me that there are two primary points of discussion on this FLIP: 1) the design question of whether or not we want resources to be more like collections of public data without much implementation logic, or whether we want resources to be black boxes that implement any behavior they choose, and 2) whether having public fields on resources is a security concern because they are unexpectedly mutable.

It seems as though the prevailing (but not universal) sentiment regarding (1) is that developers would prefer resources to be more transparent in terms of what data can be read. On (2) however, there is less consensus. An open question then, perhaps for @dete to answer: if the remaining portion of external mutability unaddressed by https://github.com/onflow/flow/pull/703 were to be fixed (perhaps via something like https://github.com/onflow/flow/pull/1056), would that assuage your security concerns about public resource fields? Would you then feel comfortable closing this FLIP?

Jul 25 '22 15:07 dsainati1

Maybe if we can somehow make this mutability / readonly reference things, what I would love to have: you can access everything to view of a resource ( including private ), but you cannot somehow copy fields ( probably shared as reference too ) or access mutable methods.

This way as everything on blockchain is possible to see, we could have that part on chain too. Then we can remove pub(set) fields from resources, even can make resources black box.

Jul 25 '22 15:07 bluesign

Closing after short chat with Dete

Sep 22 '22 17:09 j1010001

This FLIP was closed as there has not been any discussion for several month, and so far there is no majority / consensus for accepting it.

Sep 22 '22 17:09 turbolent

flow flow copied to clipboard

Add FLIP for removing public resource fields

flow
flow copied to clipboard