solidity icon indicating copy to clipboard operation
solidity copied to clipboard

Allow specifying storage locations

Open pipermerriam opened this issue 9 years ago • 96 comments

Inline assembly has now made fully up upgradable contracts possible. One of the main hangups with this is that the storage locations have to stay the same across upgrades. Would it be possible to introduce support for specifying the storage locations for storage variables?

pipermerriam avatar May 25 '16 05:05 pipermerriam

not so!

VoR0220 avatar May 25 '16 06:05 VoR0220

See Nick Johnson's Library on upgradeability :)

https://gist.github.com/Arachnid/4ca9da48d51e23e5cfe0f0e14dd6318f#file-upgradeable-sol

VoR0220 avatar May 25 '16 06:05 VoR0220

Especially with contract upgrades in mind, wouldn't it be better to copy the storage layout and "disable" unused state variables by e.g. prefixing them? Otherwise I don't see how you would practically verify that the storage layout is consistent between upgrades.

chriseth avatar May 25 '16 07:05 chriseth

Is there documentation on how storage layout is determined?

On Wed, May 25, 2016, 1:54 AM chriseth [email protected] wrote:

Especially with contract upgrades in mind, wouldn't it be better to copy the storage layout and "disable" unused state variables by e.g. prefixing them? Otherwise I don't see how you would practically verify that the storage layout is consistent between upgrades.

— You are receiving this because you authored the thread. Reply to this email directly or view it on GitHub https://github.com/ethereum/solidity/issues/597#issuecomment-221499635

pipermerriam avatar May 25 '16 12:05 pipermerriam

http://solidity.readthedocs.io/en/latest/miscellaneous.html#layout-of-state-variables-in-storage

chriseth avatar May 25 '16 14:05 chriseth

Ok, so after reading up on storage layouts...

contract MyContractV1 {
    uint a;
    bytse32 b;
}

In this example, a should be stored in slot 0 and b in slot 1.

Now, consider I upgrade it to the following.

contract MyContractV2 {
    int c;
    uint a;
    bytes32 b;
}

This would end up with c stored in slot 0, a in 1, and b in 2 which would break things.

So, instead, I propose being able to do the following.

contract MyContractV2 {
    int c;
    uint a @ 0x0;
    bytes32 b @ 0x1;
}

The solidity compiler would see that a and b are designated for storage slots 0 and 1 respectively, and would then place c at the next available location, slot 2.

Does that make sense? Is this possible?

pipermerriam avatar May 25 '16 15:05 pipermerriam

I was looking for a complementary/similar feature: the ability to disable packing. (i.e. currently if two storage parameters are each < 256 bits and together they fit into one slot, they are packed together.)

Ultimately the compiler could optimise the packing based on the frequency of changes to one ore more variables within.

With your suggestion this is a given, each marked variable gets its own slot. I would use a different markup though:

storage(0) int a;
storage(1) bytes32 b;

axic avatar May 25 '16 15:05 axic

I would use a different markup though

the @ was just the first thing that came to mind. I like storage(...) better.

pipermerriam avatar May 25 '16 15:05 pipermerriam

I think the tradeoff between introducing errors and decreasing readability is much better when just adding int c at the end. If you want, you can also use inheritance (let the upgraded contract inherit from the old contract).

chriseth avatar May 25 '16 16:05 chriseth

^ 👍 for the inheritance structure...it overall is cheaper and more cost effective to do it that way. I envision a lot of modularity around dapps in the future in regards to storage to better handle updates and save gas.

VoR0220 avatar May 25 '16 20:05 VoR0220

This came up again as a discussion with @federicobond and I think a good middle ground could be to have an annotation (as proposed in https://github.com/ethereum/solidity/issues/597#issuecomment-221611370), but instead of marking a storage slot, it would rather have a string literal as a key, which is hashed to produce a 256-bit key for storage.

This would be more expensive (due to the fact of using 32-byte long constants and one couldn't combine multiple variables into a single slot), but might be justified for some.

When this annotation is missing, it would default to the current behaviour.

For syntax I propose:

int256 a storage_key("this_is_my_variable");
bytes32 b storage_key("and_this_too");

axic avatar May 25 '18 16:05 axic

@axic

  1. I like the hashed key approach
  2. reasoning for not allowing specific slot to be specified? foot gun?

pipermerriam avatar May 25 '18 16:05 pipermerriam

I really don't think solidity should have such low-level impact on the storage location. If you want to dislocate storage variables, why not use structs or a mapping to structs?

chriseth avatar May 28 '18 20:05 chriseth

One more possible other solution:

contract MyContract {
  storage("some-collection") {
    uint foo;
    uint bar;
  }

  storage("other-collection") {
    mapping (uint => bool) qux;
    MyStruct baz;
  }
}

The advantage of this is that contracts could define blocks of variables that are colocated in storage, but providing gaps, to extend structs later, etc.

gnidan avatar Jul 20 '18 12:07 gnidan

Just throwing this as an idea: given that this need arises from avoiding clashes when working with upgradeability, wouldn't it make sense to just avoid clashing by storing all variables in a hashed location, similar to how a mapping works? We could either store all variables from the same contract/struct together (the hash being a contract identifier, and variables are stored at offsets of that hash), or all individual variables in sparse hashed locations.

The issue remains on how to generate an identifier for a contact, to ensure there are no clashes between different contracts, but that identifier is more robust than a simple name. Maybe requiring a special constant with a random value for every contract that will use this approach, similar to old Java's serialVersionUID?

spalladino avatar Sep 05 '18 16:09 spalladino

There was also a lengthy related discussion in #4017.

axic avatar Oct 29 '19 17:10 axic

This came up again with #7891 .

If we want to expose really general control we need three components:

  • storage slot
  • offset in the storage slot
  • number of bytes reserved

Natural restrictions would apply (violating would result in compile time errors):

  • offset + sizeOfType <= 32
  • numberOfBytesReserved >= sizeOfType
  • we could in a first version have (offset + numberOfBytesReserved) % 32 == 0 and only later decide whether to lift that
  • no overlap with a previously declared variable is possible

I would suggest to make all such specifiers optional. Variables without specifiers before any variables with specifiers will be assigned slots as before. For variables without specifiers after any variables with specifiers there are two options:

  • continue to put them after the last variable without specifier unless this is in conflict with another variable - if that's the case, move past it (I dislike this)
  • continue assigning storage locations after the last occupied storage location so far (including variables with specifiers) (I prefer this)

For the purpose of inheritance: locations are assigned just as if it was one flat contract containing all variables in the order of C3 linearization.

Example (we can always decide on a different syntax):

contract A {
  uint256 a; // will occuply full slot 0
  // slots 1 and 2 will remain unused
  storage{slot: 3, offset: 0, reserved: 32} bool b; // will occupy full slot 3

  storage{slot: 4, offset: 1} bool c; // will occupy the second byte in slot 4
  storage{slot: 4, offset: 0} bool d; // will occupy the first byte in slot 4
  storage{slot: 4, offset: 16} uint128 d; // will occupy the second half of slot 4

  uint128 e; // will occupy the first half of slot 5

  storage{slot: 5, offset: 16} uint128 f; // will occupy the second half of slot 5

  storage{slot: 6, offset: 0} bool g; // will occupy first byte in slot 6
  bool h; // will occupy second byte in slot 6
  storage{slot: 6, offset: 2, reserved: 2} bool i; // will occupy third byte in slot 6
  bool j; // will occupy fifth byte in slot 6
  storage{slot: 6, offset: 16, reserved: 48} uint128 k; // will occupy second half of slot 6
  // slot 7 will remain unused
  uint128 l; // will use the first half of slot 8
}

An alternative notation-wise would be to merge slot and offset into a single byte offset that is then split into slot = byteOffset/32 and offset = byteOffset%32 (to which the same restrictions would apply). A copy of the example above using this notation:

contract A {
  uint256 a; // will occuply full slot 0
  // slots 1 and 2 will remain unused
  storage{offset: 96, reserved: 32} bool b; // will occupy full slot 3

  storage{offset: 129} bool c; // will occupy the second byte in slot 4
  storage{offset: 128} bool d; // will occupy the first byte in slot 4
  storage{offset: 144} uint128 d; // will occupy the second half of slot 4

  uint128 e; // will occupy the first half of slot 5

  storage{offset: 160} uint128 f; // will occupy the second half of slot 5

  storage{offset: 192} bool g; // will occupy first byte in slot 6
  bool h; // will occupy second byte in slot 6
  storage{offset: 194, reserved: 2} bool i; // will occupy third byte in slot 6
  bool j; // will occupy fifth byte in slot 6
  storage{offset: 208, reserved: 48} uint128 k; // will occupy second half of slot 6
  // slot 7 will remain unused
  uint128 l; // will use the first half of slot 8
}

ekpyron avatar Jan 14 '20 10:01 ekpyron

Another alternative would be to require specifying the location for all variables, if the location is specified for any variable.

Also we could at a later point allow compile time evaluated expressions in the specifier, i.e.:

storage{slot: keccak256("some_key")} uint256 some_key;

Although we'd need to consider that one could construct those to specifically collide with some mapping key, so this would be dangerous.

Although that's also true for choosing some specific value for slot: that happens to be the location of some mapping element.

ekpyron avatar Jan 14 '20 10:01 ekpyron

Maybe we should gather some data about how this feature would be used. One use is avoiding clashes during upgrades, another is having more efficient use of storage by combining small variables in a certain way. I think just providing full flexibility all the time might not be the way to go as it is too easy to get wrong. So it could already be enough to only allow hashed locations and another way to specify which variables to combine (without specifying the offset exactly) or when to insert "start a new slot here".

chriseth avatar Jan 14 '20 14:01 chriseth

What can "go wrong"? Or in particular, what can go wrong that we can't easily detect at compile time? I'd argue that it makes more sense to provide a general solution and, if deemed necessary, restrict it to simple cases (as in restrict to some particular kinds of values for slot, etc. - e.g. restricting to only supporting "start a new slot here" would be to require slot to be the "current slot" plus one and require offset to be zero).

That way we can always extend the very same solution to support more cases, instead of needing breaking changes and new language features...

ekpyron avatar Jan 14 '20 14:01 ekpyron

One use is avoiding clashes during upgrades

For the sake of upgrades, it'd seem that the only requirement is to be able to assign an immutable id to a variable, which should be deterministically mapped to a slot (like the storage{slot: keccak256("some_key")} proposed above). It's not really important where in the storage the variable is kept.

As for EIP2330 linked above, the requirements are pretty much the same. As long as there is a deterministic process for calculating the storage slot, the actual slot can then be just exposed in the ABI for any consumers.

spalladino avatar Jan 24 '20 13:01 spalladino

To implement contract following EIP 1822 and/or EIP 1967, the capability to define a specific slot would be required. Right now, this needs to be done via inline assembly...

KaiRo-at avatar Apr 29 '20 17:04 KaiRo-at

After reading this issue, and thinking about the issue, this is my proposed solution.

  1. add the keyword deterministic
  2. it can only be used with declared storage variables
  3. at compile time the slot used for uint256 deterministic myNumber; is keccak256('myNumber')
  4. basically, similar to immutable which doesn't use storage, deterministic variables will be removed when calculating the sequential slots of each variable, the keccak256 is calculated at compile time and used for all instances.
  5. upgradable contracts can now rely on "this variable name will always be a specific storage slot"
  6. If it sees the same variable name in multiple inherited contracts with mixed deterministic states it should throw an error at compile time saying "can't use deterministically declared variable non-deterministically"

It seems simple enough to remove complexity but accomplish some of the major goals of this thread.

However, it seems like this thread has grown with a list of reasons and use cases which can only be satisfied by increasingly complex low level access which is difficult to implement without adding a foot cannon.

Edit: Thinking about it more, it might be sufficient to commit only to the variable name, as someone using this feature would probably make their variable names more descriptive address deterministic openZepplinProxyImplementation; which should probably throw an error if an inherited contract tries to use uint256 deterministic openZepplinProxyImplementation; which could be another foot cannon.

Edit2: I have a low-deploy-gas-cost proxy contract that I optimized the bytecode for, and it would be great to use single byte storage slots (ie. 0xff) without needing to create 255 dummies... I just tried upping my storage slot for the proxy to PUSH32 with a random hash instead of the 0 I'm using right now, it bumped my deploy cost from 80k to 100k (since I have 1 PUSH in the deploy code and 2 PUSHes of it in the contract code.)... so I would definitely also enjoy the ability to specify an arbitrary value for the slot as well... that said, my use case is extremely niche so I understand not accommodating it.

junderw avatar Jul 23 '20 22:07 junderw

Copying the suggestion from @dominicletz from #7593:

A new keyword fixed(@N) is proposed that can be used to define fixed slot position in interfaces.

interface ContractAddressMap {
   public fixed(@5) mapping(bytes32=>address) addr;
}

axic avatar Jul 29 '20 21:07 axic

Here's my spin on specifying storage slots, based on existing EIPs that could benefit from this:

  • Allow specifying the location as a (compile-time calculable) constant uint256/bytes32 (see examples)
  • Syntax-wise, appending at (<slot>) (or at(<slot>, <offset>)) seems more intuitive and readable than using e.g. @
    • Offset would be optional and default to 0. Optionally, this might also allow the syntactic sugar at <slot>
    • The exact syntax doesn't matter too much and the previously suggested storage_key("key") works too
    • The fixed(@N) syntax from #7593 could work too, if N can be more than just an integer
      • The EXTSLOAD opcode from their EIP on the EVM level doesn't care how Solidity compiles their fixed(@N) syntax
      • Their Solidity suggestion would actually benefit from allowing keys and byte32s over just integers
    • The only change I'd add is that instead of accepting (only) strings that'll be hashed, it also accepts uint256/bytes32s. This is to support e.g. EIP-1967 where we substract 1 after hashing the key
    • In my examples, I'll only be using uint256/bytes32s, but again, accepting to-be-hashed strings on top of that seems fine
  • Exclude "location-specified" storage fields from the sequential storage slot assignment logic (see someVarN in 1st example)
    • Basically, the storage slots for non-specified storage fields should be unaffected by the presence of specified storage fields
    • This also means that existing contracts should compile to the exact same bytecode as before this feature gets enabled
  • We can make use of the fact that locations are known at compile time to e.g. warn for using the same slots, overlap, ...
  • Perhaps we could even allow aliasing storage fields, e.g. referencing(<state variable>) (see NewFacet.sol in 2nd example)

This is partly with my (although limited) experience working with Solidity and proxy contracts, the feedback in this issue and the mentioned EIPs. It's both an attempt at covering as much of the use cases and requirements, while also seeing if there's any (planned) progress on this issue. I've noticed now that I'm rereading it that this is quite a big comment, hopefully that isn't too much of an issue.

Example 1 (simple proxy)

Example showing how it behaves on its own and how it interacts with non-specified slots:

contract SomeProxy {
    uint256 private someVar1; // storage slot 0x0 (in this case starting at 0x0 because there are no inherited fields)
    uint256 private someVar2 = 5; // storage slot 0x1

    uint256 public something1 at(0xAA112233) = 99; // storage slot 0xAA112233

    // This would produce an error/warning because it uses the same storage slot as `something1`
    uint256 public something2 at(0xAA112233) = 123; // storage slot 0xAA112233

    uint256 public someVar3; // storage slot 0x3 (right behind `someVar2`, thus ignoring `something1` and `something2`)

    bytes32 constant IMPLEMENTATION_SLOT = bytes32(uint256(keccak256('eip1967.proxy.implementation')) - 1);
    // ^ 0x360894a13ba1a3210667c828492db98dca3e2076cc3735a920a3ca505d382bbc 

    // Not specifying the offset, so can use the syntactic sugar version without the brackets
    address private _implementation at IMPLEMENTATION_SLOT; // storage slot 0x360894...

    // ....
}

the _implementation part is based on the storage slots specified in EIP-1967

Example 2 (diamond storage)

This could also work in interfaces, which helps a lot with diamond storage (EIP-2535), related to #9551. In this example:

  • OldFacet.sol specifies a standalone diamond facet with a storage struct at keccak256('diamond.storage.OldFacet')
    • It doesn't use DiamondStorages from Storage.sol for whatever reason. Might be an extern facet, or developed earlier
  • Storage.sol specifies an interface DiamondStorages with all known diamond storages, at their proper storage slots
  • NewFacet.sol specifies a new diamond facet that also makes use of OldFacet's storage by extending DiamondStorages
    • The contract inherits the field with the specified storage slots, without having to redeclare them
    • While the DiamondStorages needs to define the storages your new facet uses, it could instead define all storages part of the diamond. In that case, should "overlapping storage slots" warnings be enabled, your diamond gets checked for this issue
// OldFacet.sol
bytes32 constant OLD_FACET_STORAGE_SLOT = keccak256("diamond.storage.OldFacet");

struct OldFacetStorage {
    // ...
}

contract OldFacet {
    // Directly registers the storage field/slot here, e.g. an externally developed or pre-Storage.sol contract
    OldFacetStorage storage at OLD_FACET_STORAGE_SLOT;
    // ....
}


// Storage.sol
import "./OldFacet.sol";
import "./NewFacet.sol";

interface DiamondStorages {
    OldFacetStorage oldFacetStorage at OLD_FACET_STORAGE_SLOT;
    NewFacetStorage newFacetStorage at NEW_FACET_STORAGE_SLOT;
    // ....
}


// NewFacet.sol
import "./DiamondStorages";

bytes32 constant NEW_FACET_STORAGE_SLOT = keccak256("diamond.storage.NewFacet");

struct NewFacetStorage {
    // ...
}

// Makes use of Storage.sol instead of specifying the storage field/slot for every (known/used) faucet's storage
contract NewFacet is DiamondStorages {
    // Optional "aliasing storage fields" which can be handy for shortening long names to shorter ones
    NewFacetStorage private s referencing(newFacetStorage);

    function banana() public returns (uint256) {
        // Using OldFacet's storage as declared in `DiamondStorages` in `Storage.sol`
        // Using NewFacet's storage as declared by `s`, using the same slot as `newFacetStorage` in `DiamondStorages`
        return oldFacetStorage.apple + s.durian;
    }
}

Since all the used/known diamond storages are all defined and known at compile-time, the compiler could (as part of later feature requests) check whether any storage slots overlap each other.

Example for #7593

Example of how this would benefit #7593 should non-integer values for N be allowed:

interface ContractAddressMap {
    public mapping(bytes32=>address) addr at keccak256('ENS.addr');
    // or different versions of their syntax that allows non-integer `N` values:
    public fixed(@keccak256('ENS.addr')) mapping(bytes32=>address) addr; // keep `@` to differentiate from e.g. `map_v1`
    public fixed(keccak256('ENS.addr')) mapping(bytes32=>address) addr; // get rid of `@`, force 1st argument to be location
    public fixed('ENS.addr') mapping(bytes32=>address) addr; // assuming using a direct string implies `keccak256`
}

contract Resolver is ContractAddressMap {
    // ...
}

Mind that this also more or less solves their issue regarding conflicts, as compared to using fixed(@5), it's a lot harder to have clashes when using e.g. keccak256('ENS.addr'). Suggesting these kind of modifications to the EIP-2330 draft seems doable and beneficial for everyone.

Expected improvement from #3157

Should #3157 be implemented, this can help a lot with readability with complex keys later on, e.g. for our first example:


function proxy_slot(string key) public pure returns(bytes32) {
    return bytes32(uint256(keccak256(abi.encodePacked('eip1967.proxy.'), key)) - 1);
}

contract SomeProxy {
    // Points to the slot at `bytes32(uint256(keccak256('eip1967.proxy.implementation')) - 1)`
    // aka `0x360894a13ba1a3210667c828492db98dca3e2076cc3735a920a3ca505d382bbc`
    address private _implementation at proxy_slot("implementation");
}

Deterministic storage slots

This feature request is purely to use statically assigned storage slots, as specified by those EIPs or used in the diamond (storage) pattern. The whole "deterministic storage slot" idea is also interesting, but more of a separate feature from (or on top of) specifying specific storage locations. It's worth mentioning that the latter could still be relatively easily done should (the static part of) this feature be implemented:

contract DeterministicFields {
    address private _owner at keccak256('_owner');

    // or again, if we have #3157 and support pure functions in the contract itself during compile-time:
    function storage_slot(string key) internal pure returns(bytes32) {
        return bytes32(uint256(keccak256(abi.encodePacked('DeterministicFields.'), key)) - 1);
    }
    address private _admin at storage_slot('_admin');
    // ^ Would use storage slot `keccak256('DeterministicFields._admin')`, making it less likely to clash
    // with storage slots from other contracts using an `_admin` field (e.g. in base contracts or other facets in a diamond)
}

SchoofsKelvin avatar Jul 10 '21 22:07 SchoofsKelvin

In https://github.com/ethereum/solidity/issues/8353#issuecomment-932474378 I have described another solution, similar to what @gnidan and @spalladino suggested above (see https://github.com/ethereum/solidity/issues/597#issuecomment-406582568, https://github.com/ethereum/solidity/issues/597#issuecomment-418798622).

In short, you can already use a mapping of structs to give your contract a set of variables at a storage location deterministically computed from some key:

contract MyContract {
    struct State {
        uint variable1;
        address variable2;
    }

    mapping (string => State) states;

    function f() public returns (uint) {
        return states["MyContract"].variable1;
    }
}

We could just add a bit of syntax sugar on top of that and have a simple solution that might be good enough in most cases. I.e. this would generate bytecode identical to the above:

contract mapping MyContract {
    uint variable1;
    address variable2;

    function f() public returns (uint) {
        return variable1;
    }
}

Since it would be a new contract variety, it would be fully backwards-compatible.

cameel avatar Oct 01 '21 19:10 cameel

This issue has been marked as stale due to inactivity for the last 90 days. It will be automatically closed in 7 days.

github-actions[bot] avatar Feb 12 '23 12:02 github-actions[bot]

We have multiple issues along this (see e.g. the list in https://github.com/ethereum/solidity/issues/13466#issuecomment-1419319182) - I'm marking this one here unstale and as must have eventually, since it's the most generic and most likely version to be implemented eventually.

ekpyron avatar Feb 13 '23 20:02 ekpyron

This issue has become more critical with the growing popularity of account abstraction, so I would like to resume this discussion.

Account abstraction adds a new challenge that, to the best of my knowledge, no other contract has faced before.

The user owns the account, typically a proxy, and may occasionally wish to switch to a different implementation as the needs change. E.g. switch between Safe and Argent. Unlike other proxy situations where new implementations use a compatible storage layout, or at least the new implementation is aware of the old one and can perform a conversion, now we're dealing with implementations that are not aware of each other.

Currently, most account implementations start at storage slot zero. When switching implementations, the user may inadvertently end up with things like shadow signers, e.g. this Safe which seems to have one valid signer, but in fact has 20 invisible signers who can actually transact in the Safe.

There is no safe way for users to migrate between account implementations, unless each account uses a different storage base. I would like to propose a pragma for doing so:

pragma storage_base <value>;

When present, the compiler calculates keccak(value) and uses it as a hardcoded offset for storage access. A typical value would be <proj_name>.<layout_version>.

The pragma only sets the offset of next items, and following items are incremented by one, as usual. Therefore if the pragma appears multiple times, each pragma forms a "logical group" of member fields.

A buggy/malicious account would still be able to bypass it and access other storage offsets, but it'll require asm code and therefore attract auditors' attention. Pure solidity code will only access storage determined by the storage_base value.

yoavw avatar May 07 '23 20:05 yoavw

@yoavw Typically what people do to get this is to add a state variable uint[1000] junk.

// Inherit this for leaving out the first 100 slots
abstract contract storageBase {
    uint[100] junk;
}

Would that be a reasonable way to achieve the same?

hrkrshnn avatar May 10 '23 17:05 hrkrshnn