solidity icon indicating copy to clipboard operation
solidity copied to clipboard

Unchecked array access

Open nventuro opened this issue 5 years ago • 28 comments

Solidity currently performs bounds checks on array access, asserting that the index is smaller than the lenght of the array. In some scenarios, this check can be skipped by the developer to save gas, given additional guarantees on the range of the indices.

For example, consider the following snippet where a slice of an array is processed:

require(start + size <= array.length);
for (uint i = 0; i < size; ++i) {
  foo(array[start + i]);
}

Discussion around having checked arithmetic, with opt-in unchecked areas (https://github.com/ethereum/solidity/issues/9054) brought up the idea of using these same areas to have unchecked array access. Given this, the previous snippet might be written as follows:

require(start + size <= array.length);
  for (uint i = 0; i < size; ++i) {
    unchecked {
      foo(array[start + i]);
    }
}

A potential issue with this approach is that the meaning of unchecked would be extended to more than just arithmetic checks, increasing information burden on the users. Some people have proposed giving arguments to unchecked, to signal which checks are being turned off (e.g. unckecked('array-access').

I believe a plain unchecked is good enough for all of these cases, since we'll want to make unchecked areas as small as possible: there shouldn't be more than one 'checked' operation inside them.

nventuro avatar Jun 03 '20 22:06 nventuro

I have a bad feeling allowing this. Unchecked overflow is a very different thing than unchecked array access. My hope is that the new optimizer will find most of these cases and remove the check.

chriseth avatar Jun 15 '20 12:06 chriseth

Agree with @chriseth I think a much better solution in this direction would be range loops

leonardoalt avatar Jun 29 '20 12:06 leonardoalt

Indeed, @axic mentioned those as an alternative here as well.

nventuro avatar Jun 29 '20 13:06 nventuro

What is the status of this? I don't like to pay gas for array bounds checking when it isn't needed.

mudgen avatar Oct 24 '20 21:10 mudgen

What's the recommended way to do unchecked access in 0.7?

nventuro avatar Oct 25 '20 14:10 nventuro

I'm moving my comment here from the duplicate issue I made:

When looping through an array using the array length as the max bound, it is unnecessary to pay the gas for array bounds checking.

For example:

uint sum;
for (uint i; i < myarray.length; i++) {
  sum += myarray[i]
}

Is it possible for the optimizer to optimize away the bounds access? A manual way to turn it off? New range loop? What's the plan and timeline for this now?

In loops with many iterations, or many array index accesses, or nested loops this might really matter. I don't like paying gas for unneeded checking, which is most of the time I'm using arrays.

mudgen avatar Oct 25 '20 14:10 mudgen

As far as I know there is no plan/timeline for this.

@mudgen your example is basically the same as the initial one in the issue, so yes we're all aware of that case.

I still think unchecked array access/manual way to turn it off is a really bad idea. Even if the optimizer can detect some cases, the general problem is still undecidable. In my opinion the only proper solution is range loops.

leonardoalt avatar Oct 25 '20 15:10 leonardoalt

In loops with many iterations, or many array index accesses, or nested loops this might really matter. I don't like paying gas for unneeded checking, which is most of the time I'm using arrays.

Loops with many iterations is a discouraged pattern in the first place. Reason: it very well could be possible that such a piece of code will lock up due to future gas changes.

I think a much better solution in this direction would be range loops

Indeed, @axic mentioned those as an alternative here as well.

Btw, here's a potential syntax for range based loops:

uint[] array;

for (uint value: array) {}

// And with slicing
for (uint value: array[start:end]) {}

Of course for value types this doesn't support references, but references/pointers is probably something we should solve separately in general.

axic avatar Oct 25 '20 15:10 axic

@axic I think that is a very nice syntax for range loops. It would also be nice and useful if there is some way to get the iteration index of each iteration. I suppose it can be done manually be adding a variable and incrementing.

mudgen avatar Oct 25 '20 15:10 mudgen

Is there a simple way available today to do an unchecked read? Even if it involves an inline-assembly sload. Short loops (n < 10) are not uncommon, and sload is one of the most expensive operations in the EVM - the added gas overhead is in most cases unacceptable.

nventuro avatar Oct 25 '20 23:10 nventuro

If you know the loop is short and want to optimize sloads radically, what about a fixed-size array?

chriseth avatar Oct 26 '20 15:10 chriseth

Sadly they are not fixed-size but dynamic, forcing static arrays with some upper-bound imposes restrictions and extra costs that are also undesirable.

Besides, that's not really related to the point being made here about many code patterns leading to scenarios where the indexes are trivially known to be below the length, making these extra reads and checks wasteful.

nventuro avatar Oct 26 '20 18:10 nventuro

scenarios where the indexes are trivially known to be below the length

Can you write more about these scenarios? (other than examples with require.) Do you think the compiler will be able to detect these cases?

hrkrshnn avatar Oct 26 '20 19:10 hrkrshnn

@nventuro I highly disagree that these are trivial. If we're talking about only and exactly your example, sure, there's an AST pattern there. If anything changes, for example, i is incremented differently, there are branches inside the loop, the condition is a bit different, or even the same thing is simply written differently, the AST is different and you have to solve the general case. That can very easily become a very hard problem, since you're asking the compiler to automatically prove the loop invariant start + i < array.length.

leonardoalt avatar Oct 26 '20 19:10 leonardoalt

In my code if there is a chance that an array index could be out of bounds I will want to make a require statement to check for that and provide a helpful error message explaining the contextual specifics of the error. This makes the built-in array bounds checking redundant and wastes gas. Hence I would like to be able to turn it off.

mudgen avatar Oct 26 '20 20:10 mudgen

Is there a simple way available today to do an unchecked read? Even if it involves an inline-assembly sload. Short loops (n < 10) are not uncommon, and sload is one of the most expensive operations in the EVM - the added gas overhead is in most cases unacceptable.

@nventuro with EIP-2929 subsequent SLOADs won't be as expensive. This EIP is being discussed for the next hard fork.

axic avatar Oct 26 '20 20:10 axic

@mudgen so you'd write your own requires all over a sort implementation, for example? Sounds unlikely to me. Or would you just trust your code, and assume all accesses are safe? Sounds quite unsafe to me. Or would you formally verify your sorting function? Sounds not very doable to me, at least automatically.

leonardoalt avatar Oct 26 '20 20:10 leonardoalt

@leonardoalt I don't know. I'd have to look at the specific implementation. If I'm looping over an array using its length as the max bound then I don't need bound checking.

I'd like to be able to use automatic bound checking where it makes sense and not use it where it doesn't make sense.

mudgen avatar Oct 26 '20 20:10 mudgen

So you want an entirely new completely unsafe feature for this one very specific use case.

leonardoalt avatar Oct 26 '20 20:10 leonardoalt

It is easy to use safely and it has wide use. For example looping over an array using its length for the max bound is very common.

mudgen avatar Oct 26 '20 20:10 mudgen

Now we're just going in loops, I already voice my opinion.

leonardoalt avatar Oct 26 '20 21:10 leonardoalt

the AST is different and you have to solve the general case. That can very easily become a very hard problem, since you're asking the compiler to automatically prove the loop invariant

To be clear, I'm not asking the compiler to do this for me. I understand the general problem is very hard (and can be made harder if the contents of the loop can potentially alter the array in question), and even a great solution would not cover all cases.

What I'm saying is, there's concrete scenarios where I as a developer know unchecked access is fine, and would like to be able to opt-in to avoid performing work that is (based on my analysis) wasteful.

with EIP-2929 subsequent SLOADs won't be as expensive

I was not aware of the 'warm' part of that EIP, thanks!

nventuro avatar Oct 27 '20 01:10 nventuro

Are there any plans to do range-based loops in 0.8? I'm able to replace checked storage array access by mimicking arrays with index -> value mappings plus a length field, but for memory arrays that doesn't work as well.

As a crutch, it'd be great to otherwise have some form of unsafe access, leaving it up to the developer. I've managed to reduce gas costs for our most sensitive use case by 20% just with unchecked storage access.

nventuro avatar Oct 30 '20 14:10 nventuro

@nventuro it is not planned atm. Also created a new issue to discuss range based loop specifically (#10162), given this issue is a different discussion.

axic avatar Oct 30 '20 17:10 axic

Also there's this comment from @chriseth on the previous discussion https://github.com/ethereum/solidity/issues/9054#issuecomment-641194378:

@nventuro are you talking about the check in ++i? In the new code generator, the compiler should be able to get rid of it and for the old generator, if it is not possible to optimize that, then the following should work:

for (uint256 i = 0; i < array.length; unchecked { ++i } ) {
  ...
}

axic avatar Oct 30 '20 17:10 axic

What is the current thinking on this feature? Array-heavy code could definitely benefit from it.

frangio avatar Jun 30 '22 15:06 frangio

I'd like to add that it is not just about for loops. Some other pieces of code would benefit from that:

In the case of this, checking the bound would cost one sload per iteration which is 100gas each time.

Amxx avatar Jun 30 '22 15:06 Amxx

It might be obvious but just to mention, unchecked array access can be deadly if user inputs reach the array index, basically attacker could have the contract read a value from any slot they want and contract would process it. But yeah as long as only dev gets to input the index, it'd be great if the language provides "unsafe" methods. This would also hint devs that they should be careful and prevent user from manipulating the array index.

But for those who are gas golfing/need this right now, they can anytime drop to assembly and do stuff. For e.g.

contract MyContract {
    using Uint256Array for uint[];
    using Uint256Array for uint;

    uint[] public myArray;

    function sum() external  view returns (uint result){
        uint len = myArray.length;
        uint arrayPointer = myArray.pointer();
        for(uint i; i < len; i++) {
            result += arrayPointer.unsafeAccess(i);
        }
    }
}

library Uint256Array {
    function pointer(uint[] storage arr) internal view returns (uint256 result) {
        assembly {
            mstore(0, arr.slot)
            result := keccak256(0, 0x20)
        }
    }

    function unsafeAccess(uint p, uint index) internal view returns (uint result) {
        assembly {
            result := sload(add(p, index))
        }
    }
}

zemse avatar Jul 28 '22 11:07 zemse

This issue has been marked as stale due to inactivity for the last 90 days. It will be automatically closed in 7 days.

github-actions[bot] avatar Mar 06 '23 12:03 github-actions[bot]

Hi everyone! This issue has been automatically closed due to inactivity. If you think this issue is still relevant in the latest Solidity version and you have something to contribute, feel free to reopen. However, unless the issue is a concrete proposal that can be implemented, we recommend starting a language discussion on the forum instead.

github-actions[bot] avatar Mar 14 '23 12:03 github-actions[bot]