haskell-language-server icon indicating copy to clipboard operation
haskell-language-server copied to clipboard

hls-graph speculatively builds all prior dependencies for a key even if they might have changed

Open wz1000 opened this issue 1 year ago • 11 comments

I ran into this bug while debugging a flaky test case -

A.hs is initially not an FOI, so GetModIface A.hs has two dependencies IsFileOfInterest A.hs and GetModIfaceFromDiskAndIndex A.hs. Then A.hs is opened, turning it into an FOI

https://github.com/haskell/haskell-language-server/blob/dfd8f0b402a47588f8245b384dbd18a741dbf4f1/ghcide/src/Development/IDE/Core/Rules.hs#L934-L957

At this point A.hs should have dependencies IsFileOfInterest A.hs, Typecheck A.hs ... (the IsFOI branch). However, I was seeing GetModIfaceFromDiskAndIndex still getting run for A.hs.

Investigating further, it turns out to be the implementation of refresh in hls-graph:

https://github.com/haskell/haskell-language-server/blob/dfd8f0b402a47588f8245b384dbd18a741dbf4f1/hls-graph/src/Development/IDE/Graph/Internal/Database.hs#L145-L148

It seems like it speculatively builds all the previous dependencies of the key even if they might have changed. This is wrong as rules can have side effects or require preconditions (GetModIfaceFromDiskAndIndex has both). This also can lead to spurious diagnostics from these pseudo-dependencies.

We should instead build dependencies in the order that they occurred (which we currently do not maintain), and recompute with RunDependenciesChanged if any of them are dirty, short circuiting the evaluation of later dependencies (since they might not be true dependencies). However, this might have a negative impact on performance.

In the above case, that would mean skipping GetModIfaceFromDiskAndIndex after we discover that IsFOI has changed.

wz1000 avatar Dec 23 '22 15:12 wz1000

@pepeiborra thoughts?

wz1000 avatar Dec 23 '22 16:12 wz1000

It seems like it speculatively builds all the previous dependencies of the key even if they might have changed.

It doesn't build the previous dependencies, only recomputes them. This can be a no-op if the dependencies are not dirty.

Either way, this behaviour is by design, to implement early cutoff. That's my understanding at least. And as far as I remember, Shake does the same thing.

This is wrong as rules can have side effects or require preconditions (GetModIfaceFromDiskAndIndex has both). This also can lead to spurious diagnostics from these pseudo-dependencies.

Yup, I can see how the side effects and the spurious diagnostics can be annoying. Maybe GetModIfaceFromDiskAndIndex needs to check whether the file is a FOI.

Not sure what you mean about preconditions?

We should instead build dependencies in the order that they occurred (which we currently do not maintain),

The build system makes no promises around evaluation order, and for good reason since we want maximal parallelism. The only promise is that if rule B depends on rule A the build system guarantees that A will build before B.

and recompute with RunDependenciesChanged if any of them are dirty, short circuiting the evaluation of later dependencies (since they might not be true dependencies). However, this might have a negative impact on performance.

So you want to give up early cutoff? I suspect the performance impact will be massive.

pepeiborra avatar Dec 23 '22 18:12 pepeiborra

It doesn't build the previous dependencies, only recomputes them. This can be a no-op if the dependencies are not dirty.

What is the distinction between 'build' and 'compute' here? It seems like all the code in the GetModIfaceFromDiskAndIndex rule is being executed.

Not sure what you mean about preconditions?

GetModIfaceFromDiskAndIndex is only supposed to be called on non-FOIs.

The build system makes no promises around evaluation order, and for good reason since we want maximal parallelism. The only promise is that if rule B depends on rule A the build system guarantees that A will build before B.

We could add an ordering for dependencies so that dependencies using applicative style or uses can be built in parallel, but monadic bind imposes a sequencing order on them.

So you want to give up early cutoff? I suspect the performance impact will be massive.

I'm not sure why this means we have to give up early cutoff. After recomputing previous dependencies, since the rule has finished execution, we should have the result and be able to do early cutoff using that?

wz1000 avatar Dec 23 '22 19:12 wz1000

What is the distinction between 'build' and 'compute' here? It seems like all the code in the GetModIfaceFromDiskAndIndex rule is being executed.

Computing a build target involves checking for dirtiness, computing its dependencies, and then running the rule giving it a RunMode:

  • If the target is not dirty, the rule won't be run
  • If the RunMode is RunDependenciesSame, defineEarlyCutoff will return the last cached value (this is what implements early cutoff)

GetModIfaceFromDiskAndIndex is only supposed to be called on non-FOIs.

This is not a property of the build rule, but an invariant that we want to maintain. It seems difficult to encode this invariant in the build system though

pepeiborra avatar Dec 23 '22 19:12 pepeiborra

So the gist of your proposal is to not evaluate all the dependencies in parallel. Instead, evaluate them "only as much as needed", and stop as early as possible, i.e. short-circuit. Is that right?

I am unconvinced that this will prevent running GetModIfaceFromDisk for FOIs. Can you illustrate with an example?

It will also make the build system less performant in subtle ways, specially when combined with ApplicativeDo.

pepeiborra avatar Dec 23 '22 19:12 pepeiborra

So the gist of your proposal is to not evaluate all the dependencies in parallel. Instead, evaluate them "only as much as needed", and stop as early as possible, i.e. short-circuit. Is that right?

Yes.

I also noticed in profiles that hls-graph spawns essentially a new thread for each build target, which seems excessive and probably not very good for performance (see our old friend https://gitlab.haskell.org/ghc/ghc/issues/18224). Of course the exact improvement depends on how many threads are blocked on MVars or IO, versus have actual work that can be performed.

I am unconvinced that this will prevent running GetModIfaceFromDisk for FOIs. Can you illustrate with an example?

At least in the case I mentioned in the ticket body, the second time GetModIface is built, hls-graph would first check if IsFileOfInterest has changed (which it has), and then skip computing the rest of the dependencies.

wz1000 avatar Dec 26 '22 09:12 wz1000

On a project with 3000 modules, it looks like hls-graph spawns and hold on to 31000 TSOs for the duration of the entire build.

wz1000 avatar Dec 26 '22 10:12 wz1000

I also noticed in profiles that hls-graph spawns essentially a new thread for each build target, which seems excessive and probably not very good for performance (see our old friend https://gitlab.haskell.org/ghc/ghc/issues/18224). Of course the exact improvement depends on how many threads are blocked on MVars or IO, versus have actual work that can be performed.

This is by design, hls-graph is an evolution of Shake, removing all the unessential including the fancy scheduler, which is replaced by the RTS scheduler.

depends on how many threads are blocked on MVars or IO, versus have actual work that can be performed.

Build target threads block on blackholes, IIRC, see splitIO. Of course those blackholes can be blocking on other concurrency primitives but that's up to the build rule.

On a project with 3000 modules, it looks like hls-graph spawns and hold on to 31000 TSOs for the duration of the entire build.

Sounds about right. Rebuilds after edits should spawn much fewer TSOs though, have you measured that?

pepeiborra avatar Dec 26 '22 10:12 pepeiborra

So the gist of your proposal is to not evaluate all the dependencies in parallel. Instead, evaluate them "only as much as needed", and stop as early as possible, i.e. short-circuit. Is that right?

Yes.

This makes sense to me, but I still have concerns about the performance impact. Your suggestion of grouping dependencies and evaluating the group in parallel would go a long way here. So I agree that this would be a good change.

pepeiborra avatar Dec 26 '22 21:12 pepeiborra

In summary. The current approach pros: 1 easy to implement 2 can kick all dependencies concurrently.

Cons: 1 wrong semantic. 2 also performance penalty of running unnecessary rules which might be computational heavy.(that is why we branch FOI in the first place, some non-FOI only rules would actually be running even the file is opened... as in the issue head)

Some thoughts: A:

  1. Introduce a small dependency dsl with branching semantic to hls-graph so the dependencies only runs when it meet the preconditions, use it to branch the runs of dependencies.
  2. This actually also gives easy fix to a potential dual problem about missing dependencies(missing since it's parent does not run it the first time) that might not kick it's parent when dirty even if it's precondition is met. This can happen if the branching condition can change without the change of some rule.

B: Alternatively we rearrange the rules with branching dependencies which might be nasty.

soulomoon avatar Feb 18 '24 17:02 soulomoon

I am doing a proof of concept draft in https://github.com/haskell/haskell-language-server/pull/4087#issue-2145336043 Branching the deps with meta in the rule as the way alwaysrun do, to see if performance is working ok.

soulomoon avatar Feb 20 '24 21:02 soulomoon