CatchableError static exception tracking pollited by base class Exception

Open mjfh opened this issue 3 years ago • 11 comments

Are there some idea, comments on better solution for the following problem?

Objective

The OP handlers for the re-factored VM2 should be annotated with raises: [Defect,EVMError]

Problem

Some vendor modules leak potential base class Exception exceptions, most annoyingly from the eth/trie modules. Also, ref object assignment seems to flag an Exception at compile time analysis (the nimbus style guide suggests an init() function.) See https://forum.nim-lang.org/t/6783#42459 for some discussion on that topic (I also realised that the not nil object attribute seems to have gone which never worked too well anyway.)

Partial Solution Applicable to Current Code

As a partial solution without extensive code changes, I implemented the following in the branch feature/vm2-annotate-raises-exceptions:

Exceptions are treated as uncatchable, only CatchableError is processed in functions calling OP handlers
two compile modes, debug and normal
- in debug mode, selected functions are wrapped so that the base class Exception is relayed as a defect, OP handlers are annotated with raises: [Defect,CatchableError]
- in normal mode, OP handlers are annorated with raises: [Exception] only

So this allows a static exception analysis for the debug version with comparable exception run-time behavior. BTW, running the test suite does not result in much execution time penalties (observed ~0.6%.)

May 07 '21 08:05 mjfh

ref object assignment seems to flag an Exception at compile time analysis

Can you provide a minimal example for this? I haven't stumbled on such a problem.

In general, you shouldn't be shy about introducing changes in the vendor packages that improve the exception tracking. The infectious general raises: [Exception] inferred effect is most often the result of non-annotated proc pointer types or methods. In our own libraries, we can easily fix such code by adding annotations. Third-party code, we can wrap using the raiseAssert tricks.

May 11 '21 21:05 zah

Example: nimbus/config.nim(~900):

     proc getConfiguration*(): NimbusConfiguration =
        ## ...
        if isNil(nimbusConfig):
           nimbusConfig = initConfiguration()
        result = nimbusConfig

Playing with that assignment and with the initConfiguration() implementation led me come to that conclusion.

May 12 '21 07:05 mjfh

The OP handlers for the re-factored VM2 should be annotated with raises: [Defect,EVMError]

I'm not convinced that this is a useful objective, as the purpose appears to be to wrap all other exception that can be produced into EVMError, only to unwrap them immediately at the next scope out where they are caught and turned into non-exception status codes - exceptions are not returned from the EVM, status codes are.

This is more obvious with EVMC, where the status is limited to a set of codes from an enum. It would make more sense to turn exceptions into the correct status, instead of returning a general error code and losing the information.

Although I do see the conceptual cleanliness motivation, there is already a try..except outside the core interpreter bytecode while-loop to wrap exceptions. It seems to me that's the right place for it, as an interpreter loop calling op functions for every bytecode, some of which are very short, should be performance focused at least in obvious ways. (try is not free in Nim. But if you care about micro-performance in tight loops, {.raises:[].} in future Nim is the way to go.)

Considering the enthusiasm you showed for jump-table micro-optimisation, whose benefit is lower than the cost of try, I am surprised by this.

If I have this all wrong, and the point is only to wrap exceptions around calls to host services like the database etc, that's different and the motivation for wrapping is clearer. In that case, though, you know the non-EVMC calls are going away, so all calls will be through the .host interface. It is not obvious from the declarations, but as a C API boundary between different "memory universes", it is similar to threads: It isn't safe for a Nim exception to cross that C API boundary, so normal exception rules don't apply. If any get across, it's a program-terminating defect.

May 26 '21 16:05 jlokier

So this allows a static exception analysis for the debug version with comparable exception run-time behavior. BTW, running the test suite does not result in much execution time penalties (observed ~0.6%.)

You're measuring the test suite, not the EVM execution time.

Much of that is JSON parsing, blockchain construction, hashing, database storage, reading 1.2 GB of small files from the OS, etc. all of which are not part of EVM.

Probably 0.6% you saw is noise (it's marginal), unless you averaged over many runs and randomly alternated branches to avoid stuck performance externalities.

But if it's truly 0.6% the EVM performance will be different by a larger percentage than that.

May 26 '21 17:05 jlokier

Well, it's also reasonable to define a EvmError enum type and to use it consistently in place of exceptions everywhere.

The primary goal of my suggestion is to ensure that all errors are being handled. The specific raises: [Defect, EVMError] approach was suggested only because of the existing practice of the EVM to use exceptions. Optimizing the performance aspects of error handling is something that we can look into at a later stage of the project where we are polishing our public releases. Correctness comes first.

May 26 '21 18:05 zah

There is an EvmError enum, but it is called evmc_status_code.

All errors are already handled in some sense, because the interpreter loop has try..except CatchableError around the loop, which turns any exception into an EVM error status immediately and doesn't reraise.

That's what I mean when I say any EVMError exception generated inside an op to wrap a generic "external" exception, is immediately disposed of at the next scope up. It is not obvious to me what benefit that provides in type safety.

I agree performance comes after correctness, but I'm missing the correcness issue that needs solving here. If instead we were talking about {.raises: [Defect].} (or {.raises: [].}, pick your poison), then I could see it.

May 27 '21 09:05 jlokier

{.push raises: [Defect, CatchableError].} is a reasonable annotation for code that raises random stuff and where you don't really care where it came from - this is appropriate in a few cases, specially with legacy code that relies on exceptions or has special reasons to use exceptions instead of more explicit error handling mechanism,

In general, the bar for "special reasons" is pretty high - ie it should have a demonstrated and concrete benefit which is not "I don't like seeing error handling code" - we have a policy of explicit error handling in place because the error cases are considered equally important to non-error cases.

In very very rare cases, performance might be one such reason - but even in these cases, each "logical" module should ensure that the errors that leak out of it are contained and don't infect outside code, and don't get infected by transitive dependencies in an uncontrolled manner.

May 27 '21 11:05 arnetheduck

I'm not clear on who is taking what position on this, as the conversation is confusing, so I'll sum up some facts which are pretty much how it has to be.

The exceptions are already contained. They are turned into failure values before leaving the EVM. This is not new.
If there are random-stuff exceptions escaping, and there may be, those are bugs to be fixed. Exceptions must not escape the EVM. Ideally they are turnd into appropriate EVMC status values at the earliest opportunity.
The medium-term goal is that internal errors become appropriate EVMC status values. For example there are codes for out-of-gas, insufficient balance, stack overflow, etc. Meanwhile, internal errors are (already) converted to the EVMC generic failure value, which is less informative. There is one special case, which is that REVERT must return a specific failure value, to inform the EVM caller how to behave differently in this case.
The EVM entry point should be annotated with {.raises: [Defect].} or {.raises: [].} because it's not supposed to leak any exceptions. In EVMC it must satisfy that annotation, because exceptions must not escape the library boundary. It's a crash, memory corruption or undefined behavior if this happens, so it's a bug that we currently don't have this annotation.
Callouts to things that may return random-stuff exceptions, e.g. database, trie code etc, are going away from the EVM. Because of how the EVMC interface works, their replacements (upcalls via host) tend to always succeed, or there is an abort by a different method than an exception. (Any important detail about the cause is known to the EVMC host but not the EVM.) In all these cases, the upcalls are {.raises: [].}, and any defects would be severe things that should abort the program immediately (such as nil procs when not expected).

Also this, listed separately because it's an opinion not fact:

If interpreter performance is a goal(*), I think it's fine to use exceptions deliberately as a hack for performance. To return the final status/result value with fewer tests in the opcode hot path. This is for success results as well, not just error returns.

This is just another classic interpreter optimisation, and I think it's probably why exceptions were used in the EVM in the first place, to reduce per-opcode dispatch burden.

However if doing this, it's important to recognise and explain this is not the normal form of exceptions, to represent unexpected errors. It's effectively a hack, but a good one. When doing this, the relevant try must go outside the dispatch loop otherwise the benefit of the hack is defeated.

(*) The code shows conflicting messages about whether it is or not, it's an odd mix. If the CI-of-the-day-dependent matrix at https://github.com/status-im/nimbus-eth1/blob/5d0d44c38f49d8645c87333f48fd6d7ae4dbe235/nimbus/vm2/interpreter_dispatch.nim#L68 is in there (by the way it might make the code slower not faster), why would you avoid classic interpreter optimisations in the dispatch loop?

May 28 '21 02:05 jlokier

The original issue here was about "raw" (base class) Exceptions leaking. I don't like the proposed solution for this with two different modes. It is typically not so difficult to fix these (when in our own repos). For example, several got fixed in trie & trie/database here: https://github.com/status-im/nim-eth/pull/349/files So I'd start with checking what is still causing raw Exceptions right now instead of complicating code with different modes.

If they can't be dealt with immediately, one approach is to except Exception and check if they are a Defect, raise those, if not, assert. This will hopefully give you an idea if they cause real trouble. And try to get them out on the longer run.

Regarding the "random" exceptions, I agree with @arnetheduck his points there. But this is really more of a lesser problem if they are dealt with already compared to the raw Exceptions where you are not sure about what might cause them.

May 28 '21 07:05 kdeme

The goal of the raises annotations is to make the exceptions more apparent in the code. If you have a general CatchableError handler at the top-level, it will indeed stop all exceptions, but you may miss good opportunities to handle specific exception earlier, closer to the point where they were raised and more importantly in a place where handling them would have allowed the program to perform better overall.

In other words, when you start adding raises annotations you are often making trivial discoveries such as "oh, parseInt can raise a ValueError, I should handle it here".

May 28 '21 10:05 zah

The goal of the raises annotations is to make the exceptions more apparent in the code. If you have a general CatchableError handler at the top-level, it will indeed stop all exceptions, but you may miss good opportunities to handle specific exception earlier, closer to the point where they were raised and more importantly in a place where handling them would have allowed the program to perform better overall.

In other words, when you start adding raises annotations you are often making trivial discoveries such as "oh, parseInt can raise a ValueError, I should handle it here".

Yeah, completely agree on this. Saying "we are not letting any exceptions pass at the API call level so its fine" is not necessarily enough, as you might just not be aware of the possibility of a certain type of errors and thus the way they should be taken care off. I'm not sure if that is the case here as as don't know the code. And it is understandable for now considering legacy etc. I'm just trying to point out that getting rid of raw Exceptions is probably the better first step.

May 28 '21 11:05 kdeme

Closed as problem has become outdated with nim v1.2.16

Oct 20 '23 08:10 mjfh

nimbus-eth1 nimbus-eth1 copied to clipboard

CatchableError static exception tracking pollited by base class Exception

Objective

Problem

Partial Solution Applicable to Current Code

nimbus-eth1
nimbus-eth1 copied to clipboard