hlint
hlint copied to clipboard
HLint slow on larger files
% wc cabal-install/Distribution/Client/Setup.hs
2709 12689 112567 cabal-install/Distribution/Client/Setup.hs
With hlint-3.1
:
% time hlint cabal-install/Distribution/Client/Setup.hs
hlint cabal-install/Distribution/Client/Setup.hs 3,64s user 0,09s system 99% cpu 3,741 total
Where hlint-2.0.15
:
hlint cabal-install/Distribution/Client/Setup.hs 1,19s user 0,06s system 98% cpu 1,261 total
i.e. I see 3x regression in performance.
Hi @phadej ! Can you please advise on your GHC version? Is it 8.10.1 by any chance?
In the event it is, would you mind reporting your benchmark with an HLint built this way?
cabal new-build --constraint "hlint +ghc-lib" --constraint "ghc-lib-parser-ex -auto" --constraint "ghc-lib-parser-ex -no-ghc-lib"
I don't expect it to make a difference but still would be good to be sure.
I’m using GHC-8.8.3 (HLint-2.0 was compiled with GHC-8.6.5)
On 8. May 2020, at 15.39, Shayne Fletcher [email protected] wrote:
In the event it is, would you mind reporting your benchmark with an HLint built this way?
cabal new-build --constraint "hlint +ghc-lib" --constraint "ghc-lib-parser-ex -auto" --constraint "ghc-lib-parser-ex -no-ghc-lib" — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or unsubscribe.
Thanks for the report and the test case. Profiling shows we burn about 12% lexing, 68% of the user-defined hints, 18% on the built-in hints. Zooming into the user-defined hints, 50% (of total runtime) goes into unifyExpr, of which 26% goes into scope matching. I've found and fixed a few super-low-hanging fruit (~0.3s), but it's mostly shuffling the time around a bit. What we really need to do is:
- Figure out the right semantics of scope matching, since there are quite a few hints where it's not quite right.
- Optimise it so that the module-portion of matching can be done before you get the variable name. There's a lot of looping over every listed import.
- Precompile the expression matching into a lookup machine rather than a linear traversal over all patterns one-by-one.
Naturally I'm keen to do the first step first, since optimising the wrong code can often make it harder to fix the right code. In the short term I don't think there will be many improvements though.
Testing 3.2 vs 2.1.17, Duplicate
is slow.
3.2
Timing Initialise
0.00s global flags
0.00s TOTAL
Timing Parse
0.08s ProjectPlanning.hs
0.08s TOTAL
Timing Config
0.09s data/hlint.yaml
0.09s TOTAL
Timing Hint
2.98s Duplicate
1.42s Match apply
0.26s Extensions
0.12s List
0.07s Bracket
0.06s Pattern
0.02s Lambda
0.01s ListRec
0.01s Monad
0.00s Other items (7)
4.96s TOTAL
Timing TOTAL
4.96s Hint
0.09s Config
0.08s Parse
0.00s Initialise
5.13s TOTAL
Took 6.11s
2.1.17
Timing Config
0.04s /cabal/store/ghc-8.6.5/hlint-2.1.17-ae0b98d92ed036e15c5c3d8506e30faa79e6751fbdf5b13c105857a1806c08f5/share/hlint.yaml
0.04s TOTAL
Timing Parse
0.13s ProjectPlanning.hs
0.13s TOTAL
Timing Hint
1.60s Match apply
0.09s Extensions
0.08s Bracket
0.07s Import
0.04s Pattern
0.03s List
0.01s Duplicate
0.01s Lambda
0.01s ListRec
0.01s Other items (7)
1.94s TOTAL
Timing TOTAL
1.94s Hint
0.13s Parse
0.04s Config
2.10s TOTAL
Took 2.12s
Duplicate is gone as of #1150 and v3.2.1 so that shouldn't be an issue anymore.
Setup.hs
as in the original issue
HLint 3.2.1
Timing Initialise
0.00s global flags
0.00s TOTAL
Timing Parse
0.08s ./cabal-install/src/Distribution/Client/Setup.hs
0.08s TOTAL
Timing Config
0.10s data/hlint.yaml
0.10s TOTAL
Timing Hint
1.47s Match apply
0.30s Extensions
0.11s Pattern
0.11s Bracket
0.10s Duplicate
0.08s List
0.03s Lambda
0.01s ListRec
0.01s Monad
0.01s Other items (7)
2.23s TOTAL
Timing TOTAL
2.23s Hint
0.10s Config
0.08s Parse
0.00s Initialise
2.41s TOTAL
Took 3.05s
HLint 2.1.17
Timing Config
0.04s /cabal/store/ghc-8.6.5/hlint-2.1.17-ae0b98d92ed036e15c5c3d8506e30faa79e6751fbdf5b13c105857a1806c08f5/share/hlint.yaml
0.04s TOTAL
Timing Parse
0.09s ./cabal-install/src/Distribution/Client/Setup.hs
0.09s TOTAL
Timing Hint
0.78s Match apply
0.07s Extensions
0.07s Import
0.05s Pattern
0.03s Bracket
0.02s List
0.01s Duplicate
0.01s Lambda
0.00s ListRec
0.01s Other items (7)
1.04s TOTAL
Timing TOTAL
1.04s Hint
0.09s Parse
0.04s Config
1.17s TOTAL
Took 1.21s
Interactive use of HLint
is still not an option :(
The two that are most impacted are the ones where Uniplate optimisations make the biggest differences. It's possible we've fallen off the Uniplate optimisations, which would explain the divergence. Worth checking. #712 might be the thing to investigate - turning on the Uniplate speed checker.
See https://github.com/ndmitchell/hlint/issues/1151#issuecomment-721886749 for some additional performance measurements. Match again seems an order of magnitude greater than the rest.
@ndmitchell
Precompile the expression matching into a lookup machine rather than a linear traversal over all patterns one-by-one.
So per "Flag matcher" the AST is traversed once right now?
@mbj per AST node, all hints are traversed once is probably a more accurate way of saying it
FWIW, this is till the case, and working on large files is still unacceptably slow:
aeson master % time hlint src/Data/Aeson/Types/ToJSON.hs
No hints
hlint src/Data/Aeson/Types/ToJSON.hs 6,25s user 0,21s system 99% cpu 6,474 total
aeson master % hlint --version
HLint v3.8, (C) Neil Mitchell 2006-2024
It takes less time to compile that file with optimisations.
In fact, it seems hlint got slower (I think I have the same machine still as when I opened this issue):
% time hlint cabal-install/src/Distribution/Client/Setup.hs
No hints
hlint cabal-install/src/Distribution/Client/Setup.hs 5,56s user 0,10s system 100% cpu 5,655 total
(to be fair, that file has grown a bit).