Redesign handling of aggregate rules in language server
While working on #1131, I hit some severe performance issues around aggregate rule violations, frequently causing the language server to either hang or heavily tax the CPU. The reason is that an aggregate violation currently results in the entire workspace being linted. Any later change in that package again leads to the entire workspace being linted, and so on. Instead, I think we should do the following in the context of LSP (and only in that context):
- Separate aggregation from linting. I think we do this already internally, but calling linter.Lint() will obviously do both.
- On server startup, collect aggregate data from all aggregate rules, and store in the server cache.
- On file changes, collect aggregate data only from that file, and replace existing data for that file.
- On file changes, request linting of both normal rules and aggregate rules, using the aggregated data as input for the latter.
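The caching steps above could be sketched roughly as follows. This is only an illustration of the idea, not the server's actual cache: the `Aggregate` and `AggregateCache` types and the `SetFile` and `All` methods are hypothetical names invented for this example.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Aggregate is a hypothetical stand-in for the data one aggregate rule
// collects from a single file.
type Aggregate struct {
	Rule string
	Data string
}

// AggregateCache is a hypothetical per-file store of aggregate data.
type AggregateCache struct {
	mu     sync.RWMutex
	byFile map[string][]Aggregate
}

func NewAggregateCache() *AggregateCache {
	return &AggregateCache{byFile: make(map[string][]Aggregate)}
}

// SetFile replaces everything previously collected for fileURI, so a change
// to one file never requires re-collecting data from the rest of the workspace.
func (c *AggregateCache) SetFile(fileURI string, aggs []Aggregate) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.byFile[fileURI] = append([]Aggregate(nil), aggs...)
}

// All returns a flattened copy of the cached aggregates, ordered by file URI,
// suitable as input when evaluating aggregate rules.
func (c *AggregateCache) All() []Aggregate {
	c.mu.RLock()
	defer c.mu.RUnlock()

	files := make([]string, 0, len(c.byFile))
	for f := range c.byFile {
		files = append(files, f)
	}
	sort.Strings(files)

	var out []Aggregate
	for _, f := range files {
		out = append(out, c.byFile[f]...)
	}
	return out
}

func main() {
	cache := NewAggregateCache()

	// Server startup: collect from every file once.
	cache.SetFile("file:///a.rego", []Aggregate{{Rule: "example-rule", Data: "a, v1"}})
	cache.SetFile("file:///b.rego", []Aggregate{{Rule: "example-rule", Data: "b, v1"}})

	// File change: only a.rego is re-collected and replaced.
	cache.SetFile("file:///a.rego", []Aggregate{{Rule: "example-rule", Data: "a, v2"}})

	for _, a := range cache.All() {
		fmt.Println(a.Rule, a.Data)
	}
}
```

Keying the cache by file is what makes the "replace existing data for that file" step cheap: the aggregate rules can then run against `All()` without the other files being parsed or linted again.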
CC @charlieegan3 for ideas.
Yeah this sounds like a good plan to me. As I understand it, the aggregates are already set on the report, just never shown. So when we complete linting, we can stash these in the cache pretty easily.
I guess the core part that needs to be adjusted is
    if len(input.FileNames) > 1 {
        aggregateReport, err := l.lintWithRegoAggregateRules(ctx, regoReport.Aggregates, regoReport.IgnoreDirectives)
        if err != nil {
            return report.Report{}, fmt.Errorf("failed to lint using Rego aggregate rules: %w", err)
        }
        finalReport.Violations = append(finalReport.Violations, aggregateReport.Violations...)
    }
Where we might want to have a new field in the input containing all the cached aggregates?
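To make that suggestion a bit more concrete, here is a rough sketch of how such a field might feed into the decision above. Everything in it is an assumption for illustration: `Input`, `Aggregate`, the `CachedAggregates` field and `aggregatesForLint` are simplified, hypothetical stand-ins and not the linter's real types or functions.

```go
package main

import "fmt"

// Aggregate is a simplified, hypothetical stand-in for collected aggregate data.
type Aggregate struct {
	File string
	Rule string
}

// Input is a simplified, hypothetical stand-in for the linter input.
type Input struct {
	FileNames []string
	// CachedAggregates is the hypothetical new field: aggregates the
	// language server collected earlier for files not linted in this run.
	CachedAggregates []Aggregate
}

// aggregatesForLint returns the aggregates to pass to the aggregate rules,
// and whether aggregate linting should run at all.
func aggregatesForLint(in Input, fresh []Aggregate) ([]Aggregate, bool) {
	if len(in.CachedAggregates) == 0 {
		// Same condition as the snippet above: only lint aggregates
		// when more than one file is part of the input.
		return fresh, len(in.FileNames) > 1
	}

	// Files linted in this run supersede whatever the cache holds for them.
	linted := make(map[string]bool, len(in.FileNames))
	for _, name := range in.FileNames {
		linted[name] = true
	}

	merged := append([]Aggregate(nil), fresh...)
	for _, a := range in.CachedAggregates {
		if !linted[a.File] {
			merged = append(merged, a)
		}
	}
	return merged, true
}

func main() {
	in := Input{
		FileNames: []string{"a.rego"},
		CachedAggregates: []Aggregate{
			{File: "a.rego", Rule: "stale"},
			{File: "b.rego", Rule: "cached"},
		},
	}
	merged, run := aggregatesForLint(in, []Aggregate{{File: "a.rego", Rule: "fresh"}})
	fmt.Println(run, len(merged))
}
```

With something along these lines, a single changed file could still trigger aggregate rules, since the cached data stands in for the files that were not re-linted.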
Anything more to do here, @charlieegan3 ?
There is always more we can do, but I think we can close this one until we have a new performance issue we want to focus on.