dafny icon indicating copy to clipboard operation
dafny copied to clipboard

Deduplicate verification diagnostic reporting between CLI and server

Open keyboardDrummer opened this issue 1 year ago • 2 comments

Description

Mature the IPhase concept by bigger use of it in more locations. In particular, it is now used for verification and verification coverage diagnostics.

Specific changes:

  • Let Compilation (the back-end) report verification diagnostics, instead of the two front-ends IdeState.HandleBoogieUpdate (IDE) and VerifyCommand.ReportVerificationDiagnostics (CLI), reducing duplication.
  • Let Compilation call WarnAboutSuspiciousDependenciesForScope instead of IdeState.HandleBoogieUpdate (IDE) and VerifyCommand.LogVerificationResults (CLI), reducing duplication.
  • IDE only: store phase state in a tree, to enable quickly finding and removing nodes from the tree.
  • CLI only: sort diagnostics using phases

How has this been tested?

  • This is a refactoring, no additional tests are needed

By submitting this pull request, I confirm that my contribution is made under the terms of the MIT license.

keyboardDrummer avatar Mar 25 '24 14:03 keyboardDrummer

I don't have a mental model good enough to understand the intricacies of your phase model. What gives you confidence it is correct? I'm willing to trust you on the design,

Actually, I would appreciate it you could look at it critically. Could you comment first on what user experience you think is best, and then on what you think of the solution I've chosen? I'll now give some background and thoughts.

For a long time, the Dafny IDE has migrated verification results, combining old verification diagnostics with fresh parsing/resolution diagnostics. This seems like a good idea because verification is slow and while fixing a verification error, you probably want to keep seeing the error you're fixing, but also get fresh parse and resolution errors.

That said, at some point you might have changed your proof so much that you no longer care about the previous verification error, so still seeing outdated verification errors could be frustrating. Maybe verification diagnostics should only be migrated as long as the assertion, but not the proof, remain the same, although I don't know if this concept is even well-defined. Another thing is that I have never seen other IDEs report outdated diagnostics. I don't know if that's because they never show them, or because they do show them but so shortly that they believe it's fine not to make it explicit.

Back to migration: when something other than verification becomes slow, say resolution, then you run into a similar situation but with different phases. The example I found jarring was that I had a parse error, fixed it, but Dafny did not show me that I fixed it because it was still computing resolution diagnostics, and parse+resolution diagnostics were only updated after both phases finished. This gave me the idea that it would be better to think about migration and phases holistically, instead of having logic that is specific to particular Dafny-phases.

The solution I came up with was that the back-end, when it emits that it has fresh compilation results, such as diagnostics or a mapping used for code navigation, also mentions which phase these came from. For the IDE is important to know which phases exist for a particular compilation, because it can use that to know which results from previous compilations are no longer relevant, which happens if that phase no longer exists in a new compilation. For example, if you add {no-verify} to a method, then you don't want to see outdated verification results for it. Also, suppose we want to implement assertion errors only being migrated if obligation of the assertion did not change, then we could put the assertion in its own phase, and have the key of that phase be the obligation.

Because phases are discovered during compilation, it is not possible to know at the start of compilation which phases there are. To record when all phases of a particular type have been discovered, phases are ordered in a tree. If a particular node in the tree then says "I have found all my phases", then we know the list of children under it is complete, and the IDE can dispose results from old child phases that are not in the new list of children.

If results for a particular phase are found, then the IDE can use this to replace migrated results for the same phase.

An example of a phase tree would be:

  • Parsing
    • File1.dfy
    • File2.dfy
  • Resolution
    • Module1
    • Module2
    • Module3
  • Verification
    • Method1
      • AssertionBatch1

A difficulty with migration of old compilation state, is how to prevent the IDE from having the same sort of state both form an old and a new compilation. For example, we don't want the user to see the same diagnostic twice. I don't have a good understanding of whether this is a difficult problem, and I'm not sure it's easy to figure that out without using phases more. So far it has not been a problem.

Up till now I've talked about what phases can do for the IDE, but what they can do for the CLI is help with the ordering of diagnostics. If we decide to make the CLI run more concurrently, such as doing resolution concurrently, then having the diagnostics automatically be ordered based on their phase would be nice.

My long term goal for the IDE is to separate what is Dafny specific code and what is not. I believe being able to think about things in a Dafny agnostic way can make them easier to reason about. Phases and all their related code seem Dafny agnostic, so I'd be happy if we can move that logic to a Dafny-agnostic assembly.

but wanted to see if you had any insights about all this new code that contains code that is going to run concurrently.

I was a bit frustrated that I needed to add something complicated simply due to a refactoring. This would seem to be opposite to the point of doing a refactoring. However, previously the verification dependency analysis was called by the front-ends (CLI and IDE server) after verification of a symbol was fully completed, which made it easy to compute the warnings in the (fixed) order of the implementations. Now the verification dependency analysis is called by the back-end (Compilation) and is done after each implementation finishes. It is up to the front-ends to decide whether to immediately show those results, or to wait until the entire verifiable is done and then show them in order. The IDE front-end will immediately show the results, and the CLI front-end uses this new PhaseOrderedDiagnosticsReporter to order phases and only show diagnostics if the previous phase has finished, which achieves a fixed ordering of diagnostics that the tests need to pass reliably.

Optionally, we can let the CLI only use phases to order the diagnostics when a particular option --ordered-diagnostics is used, that the tests would then use. This would also make it so that diagnostics show up faster on the CLI, in particular if we make compilation run more concurrently (for example by parsing and resolving concurrently).

keyboardDrummer avatar Apr 08 '24 13:04 keyboardDrummer

Thanks for this thorough design review! Here is my a-priori over two best UX design principles for diagnostics

  • Diagnostics should be shown closest to the point where users are likely to perform an edit.
  • Diagnostics should provide enough context so that user generally knows what to edit
  • User's attention should always be brought to diagnostics that they should edit next (by importance first, and then top/down)
  • Diagnostics should ideally not disappear until it's certain that they are no longer valid (either because fixed or refreshed), to avoid the "ah I fixed it as well... ah no".
  • It's ok for diagnostics to disappear to reappear, it can help users realize something is about to be refreshed, it helps comparing diagnostics before/after
  • If diagnostics disappear to reappear, it should be not more than one second or so because it corresponds to user's acceptance of waiting in an IDE. If diagnostics disappear, user is ok waiting for one second to see if they reappear, otherwise it will believe they solved the problem.
  • Earlier phases are more important than later phases when reporting diagnostics.

Given my view above, I like the phase system which is more Dafny agnostic, the way you describe it at least.

MikaelMayer avatar Apr 08 '24 19:04 MikaelMayer