scala3 Introduce Best Effort compilation options

The aim of this ~~draft~~ PR is to introduce a compiler improvement for the use in IDEs. The idea was to allow for IDE's to work on erroring projects more easily, without any manual touchups (see https://github.com/scalameta/metals-feature-requests/issues/327). To achieve this, a tasty-like format that can include errored/incomplete trees can be produced by the compiler, and consumed by the presentation compiler. Though mostly finished, this is a draft PR, meant for the maintainers and contributors of dotty to review the approach presented here. I am not yet sure would would it take to actually merge a feature like this one, although it is worth noting that all of the changes presented here are hidden under experimental options.

Introduction

This PR introduces 2 new experimental options for the compiler: -Ybest-effort-dir and -Ywith-best-effort-tasty. It also introduces the Best Effort TASTy format, a TASTy aligned file format able to hold some errored trees. Behaviour of the options and the format is documented as part of this PR in the best-effort-compilation.md docs file.

Considerations

My overall aim was to be able to handle as many cases, with little as little maintainance necessary as possible. This is for example why pretty much the only required phases are Parser and Typer - they are enough for, as far as I know, all necessary metals completions and I did not find any issues with setting their products (best effort tasty files) as dependencies. Requiring, for example, PostTyper, would require for the errored trees to be able to somehow pass through that phase, meaning a large investment from me into working the existing known error cases through there (muddling the codebase in the process) and possibly from the maintainers working on both Typer (to be able to produce „correct” error trees) and PostTyper (to be able to consume „correct” errored trees), which would obviously make the entire initiative dreadful.

This is also why any tests are able to be put into a blacklist file, in case something changes and a neg test will not pass, or a new test will be added as part of issue fix that does not play well with the best-effort features.

Controversial aspects of the implementation

I wanted to relay some implementation aspects that may be seen as controversial, to more easily obtain feedback related to them (all welcome).

~~Allowed crashing - I imagine this may confuse users of metals, so those crashes should likely not be shown there in the logs. Handling every possible incorrect tree in the pickling phase would be quite a maintenance toll, so we allow the compiler to crash in some instances. If this proves to be a significant problem, one easy solution would be to catch any error in Best Effort TASTy pickling and unpicking and print some vague message. I also want to reiterate that this can only happen when using the newly added options.~~ EDIT: I added a separate error message wrapper for crashes obtained during bets-effort compilation.
~~Undocumented Best Effort TASTy grammar - or rather documented only by the implementation. Serializing incorrect trees (without defining any set of what these incorrectness constitute as, possibly excluding some possible errored trees in the process) is a difficult problem. This aspect leads to my implementation, where I’ve made it so the pickler is allowed to panic and not serialize certain subtrees, or serialize error types in their place (same with unpicking), so some subtrees may be incomplete. Also, if this was documented we would likely want to version it, which leads to…~~ EDIT: I added format grammar documentation in BestEffortTastyFormat.scala.
Best Effort Tasty versioning - ~~or lack thereof. Unpickling will still fail when when compiling with an earlier minor version, but above that I added no other requirements. I think the implementation adjustments should be allowed in the patch releases, so we can course correct in case of mistakes. However, I think as the feature stabilizes, the changes will stop happening. If we already allow some error trees to not work (and we have to), some cases not working across some patch releases does not seem like a big issue (whether anyone even uses multiple patch versions across one multimode build is another thing). Additionally would personally like to see this backported onto LTS one day, so that’s another reason.~~ EDIT: I still do not check separately best effort tasty versioning, but I reserved a space in the file to be able to do that if that becomes necessary.

The flipside of all of these quirks is that the feature in general works really well. Out of the ~~1312~~ 1559 failing „neg” test files checked, ~~1286~~ 1534 manage to pass the pickling and unpicking tests (most of the fails and crashes are caused by cyclic reference errors).

May 25 '23 11:05 jchyb

I'm not opposed to having a slightly different, best-effort format for non-compiling code. However, I am worried about the "intentionally unspecified" aspect of it. Citing from a comment in the PR:

The Best Effort TASTy format also informally extends the regular TASTy grammar to allow the handling of as large amount of incorrect trees produced by the compiler as possible. Because of this informality and unlike the regular TASTy format, the BETASTy format should stay internal to the compiler.

(emphasis mine). Yet the whole point is for that format to be read from IDEs, which are definitely not internal to the compiler. That is already a direct contradiction, and is a symptom of what I'm worried about.

If we go this route, we should in fact specify the best-effort tasty format. Ideally, it would be possible for something like tasty-query to consume it and still provide some amount of guarantees.

If we don't specify it, all we have is some random data produced by some compiler, which some IDEs may be able to process to some extent. I am perhaps exaggerating those "some"s for effect, but even so, that's not a good situation to be in.

What fundamentally prevents specifying this format? Are there known blockers? Can we solve them? Even if we cannot attach run-time (evaluation) semantics to the resulting symbol table and trees, we should be able to attach specified, compile-time (structural) semantics.

May 26 '23 12:05 sjrd

Sorry for the belated reply, I wanted to take care of some other issues before I came back to this.

By writing that it should stay internal to the compiler I meant that it should only be able to be read and written by the compiler. Since Metals uses the compiler directly (via the so-called presentation compiler) and never reads from the tasty itself, it is not an issue there.

The informality is there mostly because of the implementation. Accounting for unforeseen states of the compiler can be difficult. After all, we do aim to pass incorrect trees into the pickling phase, so since they are incorrect, reasoning about what will be found there is tough. So the solution I came up was to try to force pickling with a series of fallbacks, based on the already found neg-test of the compiler. If something cannot be pickled, in its place we pickle an ErrorType, Empty block, or nothing (depending on the situation). The main problem lies with unpicking. Since we pickle very naively like this, there are some situations where an assumption used in unpickler may not be correct, so we end up fallbacking in unpickler as well (made up example, since it is tough for me to remember specifics right now, we might fail to pickle a method, then while unpicking a class the method is expected there, so we stop unpickling that).

This may sound somewhat dire, but the two alternatives I know of are much worse to me:

rewrite completely pickler and unpickler for best effort - extremely high maintenance cost (not even mentioning implementation)
limit to easily identifiable cases, like simple Incorrect Type Errors, Missing Type, etc., and incorporate that into the current pickler and unpickler implementation - much less useful, in my opinion

So the be able to be formalized, it would need to be rewritten into one of the above.

Jul 03 '23 12:07 jchyb

Since we pickle very naively like this, there are some situations where an assumption used in unpickler may not be correct, so we end up fallbacking in unpickler as well (made up example, since it is tough for me to remember specifics right now, we might fail to pickle a method, then while unpicking a class the method is expected there, so we stop unpickling that).

IMO this is the core of the problem. Your proposal is to say "anything goes", and who knows whether unpickling will be able to recover from it. Instead, I propose to specify what the unpickler needs to be able to recover from. Pointing to a non-existent method? That's fine in a sense, with the caveat that we cannot compute a tpe for such a tree, so it becomes ErrorType as well. If those things are specified, then external tools can recover from the same situations.

Jul 03 '23 17:07 sjrd

If those things are specified, then external tools can recover from the same situations.

I think the idea is that this shouldn't be used by external tools currently or ever (?)

Jul 03 '23 17:07 tgodzik

I don't think that point of view makes sense. If it stayed within one project (in the sbt sense of the term), that would make sense. But as soon as you cross the project boundary, even "within the compiler" can mean "across different versions of the compiler". Or different options. Also, it means that build tools need to communicate these files around; likely take them into consideration for incremental compilation (you can't afford to drop down to full recompilation just because an upstream dependency has a best-effort output), etc. IMO, that means these things will de facto be read by other tools.

Jul 04 '23 07:07 sjrd

About the use in different compiler versions. My first idea was to limit the use to only the same patch compiler versions. This way we would be able to freely enact changes to the implementation, especially important for the LTS. This made sense to me as it seemed unlikely that many people would use different versions of scala 3 across different projects.

However, it's not like the current best-effort implementation has any guarantees to uphold anyway. It can still fail to produce/read betasty for rare cases, like for programs with certain circular type dependencies. So if we were to allow using betasty across different versions, and it would only work sometimes, this still might be better than artificially limiting it to a subset of project structures. As the implementation matures and is kept unchanged in subsequent compiler versions, it is going to fail for less and less cases. And that approach is always going to work for more cases than when limiting betasty use to the same compiler patch version. This is supported (somewhat by accident) by the fallback-based implementation where we may naively fallback both while writing and reading the betasty file. So ultimately, what I decided there was to keep the strong requirements about the minor versions from regular tasty file format, and still allow it to try reading (and perhaps failing to read) betasty from different patch versions.

You are right about the incremental compilation - we cannot really use it with the way best-effort compilation is now. But I don't think it can be used for failing projects anyway. Previously, modules depending on failing modules would just not be able to be compiled at all.

Jul 04 '23 12:07 jchyb

About the use in different compiler versions. My first idea was to limit the use to only the same patch compiler versions. This way we would be able to freely enact changes to the implementation, especially important for the LTS. This made sense to me as it seemed unlikely that many people would use different versions of scala 3 across different projects.

I think that scenario is highly unlikely and since we have only best-effort compilation, it's fine if that fails at times.

You are right about the incremental compilation - we cannot really use it with the way best-effort compilation is now. But I don't think it can be used for failing projects anyway. Previously, modules depending on failing modules would just not be able to be compiled at all.

It's highly unlikely that we could provide sensible incremental compilation here actually, only using the correctly compiled artifacts before. I don't think we should try to support it at this time for artifacts with potential errors.

@sjrd what kind of specification would you have in mind actually? How would that help the tooling? We would want to keep the changes to the compiler as maintainable as possible, so that we don't need to rewrite too many things.

@jchyb would it be possible to at least specify the cases you know of? So that we have a list for potencial tool formalization later?

Jul 05 '23 13:07 tgodzik

The best way forward is to decide that BETASTy files can only be read and written by the exact version of the compiler. This way, we can really treat them as an implementation detail. This will be beneficial as I believe there will be many reiterations and improvements along the way after shipping the first version. Maintaining any compatibility guarantees about BETASTy files is not our goal right now.

Later, when we are sure about the design and implementation, we can "open" the format by writing the specification and maintaining the same compatibility guarantees as TASTy does. Doing it prematurely may be counterproductive, as the design may change based on feedback.

Jul 05 '23 13:07 Kordyjan

Hello again, apologies for a lack of (visible) movement in this PR. In the meantime, I managed to add many improvements to the implementation here, to the point where I was also now able to specify the grammar used in the Best Effort TASTy files in a concise manner (found in BestEffortTastyFormat). With that in mind, I still agree with Paweł’s remarks above, and with the fact that for now the format should only be consumed by the compiler. However, to future proof the (now unchecked) compatibility concerns, I added and additional field (now unused, but reserved) for betasty versioning - though if deemed unnecessary for now, I can of course remove it. All in all, I consider the work here complete, and ready for review. A more concise explanation of the PR can be found in the commit message, which I think could be helpful for review.

Aug 30 '23 12:08 jchyb

Due to @jchyb temporal unavailability, I've been asked to resolve conflicts here. (I've rebased onto dotty-staging main and one conflict remains, but it can be easily solved in web editor)

Feb 20 '24 12:02 Florian3k

(I've rebased onto dotty-staging main and one conflict remains, but it can be easily solved in web editor)

dotty-staging main is quite phantom. You should rebase on top of lampepfl main. Then there wouldn't be any conflict left.

Feb 20 '24 12:02 sjrd

Okay, I've rebased on lampepfl main

Feb 20 '24 13:02 Florian3k

Hi @sjrd, I have now rebased the PR (and replied to some leftover review comments). Sorry about having done this late, I was not really available for the past few months. Please consider looking at this again when you have the time (the previous review was really helpful). The only real change here (outside of applying review suggestions) is that I copy/pasted one of the methods from tasty-core to BestEffortTastyHeader to appease the MIMA checks

Mar 26 '24 12:03 jchyb

@sjrd Thank you for the your last review. I have applied all of the suggestions and rebased on top of the latest PR's (including pipelining). I've also made sure to carefully recheck everything this time, however I only ended up adjusting the passesConditionForErroringBestEffortCode method, everything else is as it was before

Apr 17 '24 11:04 jchyb

@jchyb Consider squashing everything before merging. Also make sure the commit message is adequate (if we merge as is we take the PR description, which is outdated).

Apr 18 '24 07:04 sjrd

Thank you so much for all of the reviews and overall help with this @sjrd!

Apr 18 '24 09:04 jchyb

Congrats @jchyb ! Great work!

Apr 18 '24 09:04 tgodzik