solidity icon indicating copy to clipboard operation
solidity copied to clipboard

User-defined literal suffixes.

Open chriseth opened this issue 2 years ago • 6 comments

(EDIT by @cameel)

Design

Literal suffixes - design decisions

TODO

Here's a summary of what's left to do here. Overall not much concrete stuff. Just tons of small things.

Concrete things to do

  • [ ] Fix failing command line tests.
  • [ ] Disallow suffixes with zero or more than one return value.
  • [ ] Disallow suffixes with storage and calldata return values.
  • [ ] Disallow signed exponent for decimal suffixes.
  • [x] Disallow non-pure functions as suffixes.
    • [ ] ~Or fix: ViewPureChecker does not complain about mutable/view suffixes used inside view/pure functions.~
  • [ ] Suffixed literals should be marked as function calls in the AST.
    • Add an annotation that makes an AST Expression node as a semantic function call (i.e. generalize the one we added for operators in #12362)
  • [ ] Allow suffix overloading.
  • [ ] Fix: ICE when an integer or decimal with zero fractional part is decomposed into mantissa and exponent.
  • [ ] Fix: Suffixed values are not disallowed as array sizes even though they're not constants.
  • [ ] Fix: Suffixes not detected as function calls by control flow graph (no "unreachable code" warning).
  • [ ] Include suffix definition as a secondary location in errors.
  • [ ] Grammar update.
  • [ ] Documentation + changelog.

Things to check and maybe fix/implement

There are some things that might be already partially or fully working:

  • [ ] Research syntax variants for marking function declaration usable as a suffix.
  • [x] Make sure types other than uint in suffixes for rational literals are allowed (both single arg and with mantissa/exponent).
  • [x] Make sure our analysis (in particular the function call graph and control flow graph) follows suffixes like other function calls. Where possible, add test coverage.
  • [ ] There are a lot of TODOs in the code that need to be reviewed

Extra test cases

  • Syntax tests
    • [x] Suffixing negative values.
    • [x] Suffix overloading.
    • [x] Range of mantissa and exponent, especially with suffixes accepting integer types other than uint.
    • [x] Correct parsing of suffixes in ternary operator expressions without parentheses.
    • Invalid cases
      • [x] Suffixed literals are not constants (should not be usable in constant definitions, array declarations, etc.).
      • [x] Suffixing array literals, tuple literals, structs, enums.
      • [x] Suffixing errors and events.
      • [x] Suffix returning a struct containing a mapping.
  • Semantic tests
    • [ ] Correct return values for all kinds of literals (integers, booleans, decimals, addresses, strings, bytes).
    • [ ] Suffix use in various places (different types of calls, new, indexing, initialization, modifiers, abi.encode, etc.)
    • [ ] Overloaded suffixes.
    • [ ] Values at the edges of supported ranges.
    • [ ] Suffixes shadowing other things.
    • [ ] Suffixes called szabo and finney (deprecated denominations).
    • [ ] Arbitrary return types.
    • [ ] Calling the function pointer returned by a suffix.

chriseth avatar Feb 11 '22 19:02 chriseth

Is this waiting for reviews or code changes?

leonardoalt avatar Apr 04 '22 10:04 leonardoalt

Both.

chriseth avatar Apr 11 '22 14:04 chriseth

One thing to note about the 1 f notation (i.e. without _ between the literal and the suffix) is that we'd have to either add finney and szabo back to the parser or they'd be now allowed as custom literal suffixes. Someone could define them though not with the same exact semantics (i.e. no arbitrary-precision arithmetic).

cameel avatar Jun 02 '22 16:06 cameel

I wrote some example code without _ and it looked really weird, so I would say we should require the _ both for defining the function and for using it.

chriseth avatar Jun 07 '22 11:06 chriseth

@chriseth Actually we need to talk about syntax again because there is a problem with _. We use it as a number separator and this makes _ suffixes ambiguous with hexadecimals. For example:

    0x1_000_fff

Is this the number 0x1000fff or 0x1000 with a suffix called fff?

cameel avatar Jun 07 '22 11:06 cameel

EDIT: I moved the TODO list to PR description.

cameel avatar Jul 12 '22 15:07 cameel

user-defined “post-fix operators” vs. user-defined literal suffixes: https://github.com/ethereum/solidity/pull/12362#discussion_r953938583

aarlt avatar Sep 07 '22 13:09 aarlt

I really like using using to tie functions to literal suffixes. In my opinion, the following problems still need to be ironed out:

  1. the literal suffixes have to be usable for fractional literals. Those do not have a type. We added a mechanism to call the function with to arguments. A potential solution could be to either only use the using statement to declare a function being a literal suffix without specifying the type (usign f as suffix;) or have some special words to denote the different kind of literals (using f as string suffix; using f as int suffix; using f as fractional suffix;)
  2. Suffixes usually have short names and thus easily clash with other identifiers. Because of that (and also to avoid confusion), it might be useful to disallow referencing (and thus calling) such a function in a regular context, i.e. disallow m(1), but only allow 1 m. In this example, just writing m would result in an "identifier not found" error. The added complication here is that if the effect of using f as int suffix; is that f is not available as stand-alone identifier, how does the reference to f work inside the using statement? Maybe we can make an exception to that rule for using statements....

chriseth avatar Sep 07 '22 14:09 chriseth

I really like using using to tie functions to literal suffixes. In my opinion, the following problems still need to be ironed out:

  1. the literal suffixes have to be usable for fractional literals. Those do not have a type. We added a mechanism to call the function with to arguments. A potential solution could be to either only use the using statement to declare a function being a literal suffix without specifying the type (usign f as suffix;) or have some special words to denote the different kind of literals (using f as string suffix; using f as int suffix; using f as fractional suffix;)

I like the variant with fractional suffix.

  1. Suffixes usually have short names and thus easily clash with other identifiers. Because of that (and also to avoid confusion), it might be useful to disallow referencing (and thus calling) such a function in a regular context, i.e. disallow m(1), but only allow 1 m. In this example, just writing m would result in an "identifier not found" error. The added complication here is that if the effect of using f as int suffix; is that f is not available as stand-alone identifier, how does the reference to f work inside the using statement? Maybe we can make an exception to that rule for using statements....

So you would like to support functions and literal suffixes with the same name? e.g. m(1) is allowed, if there was a function e.g. m(uint256) defined. Where suffixes are only allowed to be used as suffixes 1 m? m(1) will be always forbidden, if only defined as user-defined suffix (no matching function e.g. m(uint256)).

However, not sure what using restrictions actually apply here. I don't fully understand the restriction with f is not available as stand-alone identifier. I think that this should at least work, when f was defined as a file-level function, right?

aarlt avatar Sep 07 '22 16:09 aarlt

We always have to think about this as being used across files. One file defined the suffix, another file uses it. The using file should explicitly import it. If we just say import m from "x.sol";, then we should be able to reference m by just writing m. But if we allow that, it blocks any other identifier called m - this is what we want to avoid.

chriseth avatar Sep 08 '22 13:09 chriseth

This is now rebased on top of the operators PR (#13790).

cameel avatar Dec 09 '22 14:12 cameel

@aarlt I see that thanks to your refactor I can finally see the errors from the failing AST import tests here. Before it used to just exit silently after printing some dots.

cameel avatar Dec 09 '22 15:12 cameel

I am still trying to understand how I managed to push the commit to the wrong branch. Sorry about that. :sweat_smile:

matheusaaguiar avatar Feb 24 '23 22:02 matheusaaguiar

Don't worry, but please restore this to the original state :)

I see you reverted the top commit - this does not remove it. You need to move the branch label locally and then push it so that it does not include your commits.

cameel avatar Feb 24 '23 22:02 cameel

Should be restored to the previous state now.

matheusaaguiar avatar Feb 24 '23 22:02 matheusaaguiar

Still has one commit from your branch on top.

cameel avatar Feb 24 '23 22:02 cameel

Removed the remaining commit.

matheusaaguiar avatar Feb 24 '23 23:02 matheusaaguiar

For future reference, if needed: I have been trying to isolate the problematic code in the external tests. I chose ens because it is the faster to fail. First, I tried to check if our shell script could print the name of files being compiled, but that is not possible directly. I am not sure if hardhat can do that and then instead I just tried erasing the whole contracts folder (created in a temp location, like /tmp/ext-test-ENS-.../, by our script) and starting with a minimal set of files that could compile without errors. Next, I started adding folders until I got to two interdependent ones which generate a compiler exception: wrapper and ethregistrar. These have around 30 files, to which I was going to apply the same methodology in the hopes of finding which is the problematic one and then finally be able to isolate the offending block of code. These files reference other projects, so I am not sure if the problem is really here. Maybe there's a better, easier way to do this, but atm that's what I can do straightforwardly without having to detour and dive deeper into the tools like hardhat.

matheusaaguiar avatar Mar 24 '23 15:03 matheusaaguiar

@matheusaaguiar Surprisingly, removing the codegen refactor did not make the assert failure in external tests go away (unless it's just a different assert failing now). So we still need to get to the bottom of it after all.

cameel avatar Apr 05 '23 12:04 cameel

Also, a faster way to isolate it might be to try passing individual top-level files to solc. As long as you use the --include-path option to point at the dependencies, files should generally compile and you should be able to isolate the one that is failing. Then find the spot by cutting out imports.

cameel avatar Apr 05 '23 12:04 cameel

I managed to identify what file and what code causes an assertion to fail, in the ens external test. The file is contracts/wrapper/NameWrapper.sol and the problem seems to be with function _canTransfer at line 607 which overrides a virtual function from another contract. The file is inside the tmp folder generated by our test script. Running the following command from the contracts folder, reproduces the problem: solc --base-path . --include-path ../node_modules/ ./wrapper/NameWrapper.sol --optimize --via-ir

It makes this assertion fail: https://github.com/ethereum/solidity/blob/c6cc2e4d8198b7b1c0d8a35824eb34c4aefb56bf/libsolidity/codegen/ir/IRGeneratorForStatements.cpp#L1073

matheusaaguiar avatar Apr 06 '23 05:04 matheusaaguiar

@matheusaaguiar Thanks, that was enough to track it down!

Turns out that the problem is caused by a virtual function that gets overridden with another function that's not exactly identical (view rather than mutable). I've got a repro:

abstract contract A {
    function f() internal {
        v();
    }

    function v() internal virtual;
}

contract C is A {
    function test() public {
        f();
    }

    function v() internal view override {}
}

My assumption was that the types must be identical and this proves it wrong. I'll need to adjust it.

But also good that now we have a test for it. Apparently external tests do that a lot and we did not have even one test case covering it.

cameel avatar Apr 06 '23 10:04 cameel

@ekpyron This is now ready for review.

Current status:

  • I extracted small bits into separate PRs. Only the second one is a dependency here.
    • #14103
    • #14104
  • There are a few smal leftovers from the checklist above that I decided to do in separate PRs:
    • The PR allows overloading. I still think we should not disallow it, but I prepared a follow-up PR that would do it: #14105.
    • @matheusaaguiar is working on the SMTChecker part (just a warning and tests, not full support).
    • Documentation.
    • Refactor to add a new CallKind instead of isSuffixCall annotation.

cameel avatar Apr 07 '23 14:04 cameel

I have just posted a summary of the current behavior of literal suffixes: https://github.com/ethereum/solidity/issues/13718#issuecomment-1503203213. I recommend reading it before reviewing.

cameel avatar Apr 11 '23 12:04 cameel

I updated #14105 to use the method suggested by @chriseth and it does work, though IMO error messages are very suboptimal. I think we'd still be better off handling that case specially and issuing a better message (which requires the m_currentFunctionCall hack).

It works so I added the changes from that PR here. Overloading is now disallowed.

cameel avatar Apr 12 '23 10:04 cameel

I updated #14105 to use the method suggested by @chriseth and it does work, though IMO error messages are very suboptimal. I think we'd still be better off handling that case specially and issuing a better message (which requires the m_currentFunctionCall hack).

It works so I added the changes from that PR here. Overloading is now disallowed.

Never pay for minorly improved error messages by majorly increased code complexity, it's not worth it - the error messages here will be perfectly fine.

ekpyron avatar Apr 12 '23 11:04 ekpyron

I dealt with all the comments and also added some minor tweaks (mostly tests, but also things like mentioning location in error message for reference types).

Also note that the docs are ready for review in #14121.

EDIT: SMTChecker support is done in #14118 and also can be merged into this PR when approved.

Also is one of the refactors from the parent issue is in #14124, but note that it's independent of suffixes and the suffix PR can be merged without it.

In general, you can safely focus only on suffix PRs with the roadmap label.

cameel avatar Apr 14 '23 18:04 cameel

We decided not to merge this feature so I'm closing the PR. See https://github.com/ethereum/solidity/issues/13718#issuecomment-1551488965 for details.

cameel avatar May 17 '23 17:05 cameel