Optimization: cache semantic lookup in tokenizer, add semantic tokens benchmarks
- There is a high chance that an identifier appears multiple times in a file, so caching the identifier's semantic type result in the tokenizer saves repeated computation.
- A new test case is added to ensure invisible tokens do not sabotage the cache lookup.
- Add a benchmark for semantic tokens.
- Fix a benchmark running issue.
It gains about a 5% performance improvement for the GetSemanticTokens rule.
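The caching idea can be sketched roughly as follows. This is a minimal illustration under assumed names, not the actual HLS code: `Identifier`, `SemanticType`, `computeSemanticType`, and `tokenizeFile` are hypothetical stand-ins (the real tokenizer works over HIE AST and type information, not strings). The point is that a per-pass map means each distinct identifier is computed at most once:

```haskell
import qualified Data.Map.Strict as Map
import Data.List (mapAccumL)

-- Hypothetical stand-ins for the real HLS types.
type Identifier = String
data SemanticType = Variable | TypeCon deriving (Show, Eq)

-- Placeholder for the expensive per-identifier computation we want
-- to avoid repeating.
computeSemanticType :: Identifier -> SemanticType
computeSemanticType name = case name of
  (c:_) | c >= 'A' && c <= 'Z' -> TypeCon
  _                            -> Variable

-- Memoized lookup threaded through the tokenizer pass: consult the
-- cache first, compute and insert only on a miss.
cachedSemanticType :: Map.Map Identifier SemanticType -> Identifier
                   -> (Map.Map Identifier SemanticType, SemanticType)
cachedSemanticType cache name =
  case Map.lookup name cache of
    Just ty -> (cache, ty)                         -- cache hit: no recomputation
    Nothing -> let ty = computeSemanticType name   -- cache miss: compute once
               in (Map.insert name ty cache, ty)

-- Tokenize a whole file's identifiers with one shared cache per pass,
-- so repeated occurrences of the same name become cheap lookups.
tokenizeFile :: [Identifier] -> [SemanticType]
tokenizeFile = snd . mapAccumL cachedSemanticType Map.empty
```

Since the same identifiers usually recur many times in a source file, the cache turns most lookups into `Map.lookup` hits rather than full recomputations.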
Could we have some benchmarks on large files to see if this actually makes a difference?
With

```yaml
- name: cabal
  package: Cabal
  version: 3.6.3.0
  modules:
    - src/Distribution/Simple/Configure.hs
```

and running:

```shell
cabal bench -j --benchmark-options="profiled-cabal"
```
| version | configuration | name | success | samples | startup | setup | userT | delayedT | 1stBuildT | avgPerRespT | totalT | rulesBuilt | rulesChanged | rulesVisited | rulesTotal | ruleEdges | ghcRebuilds | maxResidency | allocatedBytes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| upstream | All | semanticTokens | True | 50 | 2.09 | 0.00 | 9.37 | 0.03 | 3.10 | 0.13 | 9.40 | 5147 | 5146 | 5147 | 5908 | 31251 | 1 | 174MB | 13491MB |
| cache-semantic-lookup | All | semanticTokens | True | 50 | 1.87 | 0.00 | 8.36 | 0.10 | 2.77 | 0.11 | 8.46 | 5147 | 5146 | 5147 | 5908 | 31251 | 1 | 173MB | 13437MB |
Looks like a decent improvement. Could you add this benchmark to the ghcide-bench experiments please?
the ghcRebuilds number seems to indicate that we aren't editing the file in the benchmark passes. Could you ensure that each benchmark iteration also edits the file?
> the `ghcRebuilds` number seems to indicate that we aren't editing the file in the benchmark passes. Could you ensure that each benchmark iteration also edits the file?
Sure, I will make some changes.
I need to take some more time to come up with a better benchmark for this. Converting it to a draft for now.
The overall improvement is not really obvious: slightly better memory usage and speed.
| version | configuration | name | success | samples | startup | setup | userT | delayedT | 1stBuildT | avgPerRespT | totalT | rulesBuilt | rulesChanged | rulesVisited | rulesTotal | ruleEdges | ghcRebuilds | maxResidency | allocatedBytes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| upstream | All | semanticTokens | True | 100 | 2.00 | 0.00 | 128.78 | 1.27 | 3.41 | 0.63 | 130.06 | 12 | 10 | 1364 | 5914 | 31271 | 102 | 296MB | 568821MB |
| cache-semantic-lookup | All | semanticTokens | True | 100 | 1.92 | 0.00 | 126.47 | 2.11 | 3.43 | 0.62 | 128.60 | 12 | 10 | 1364 | 5914 | 31271 | 102 | 290MB | 562533MB |
The `GetSemanticTokens` rule consistently gives better results.
Another result: after updating to master, it does seem to show a decent improvement.
| version | configuration | name | success | samples | startup | setup | userT | delayedT | 1stBuildT | avgPerRespT | totalT | rulesBuilt | rulesChanged | rulesVisited | rulesTotal | ruleEdges | ghcRebuilds | maxResidency | allocatedBytes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| upstream | All | semanticTokens | True | 100 | 1.99 | 0.00 | 148.55 | 2.80 | 3.75 | 0.73 | 151.36 | 12 | 10 | 1364 | 5914 | 31271 | 102 | 307MB | 594060MB |
| cache-semantic-lookup | All | semanticTokens | True | 100 | 1.93 | 0.00 | 134.02 | 1.15 | 6.84 | 0.64 | 135.18 | 12 | 10 | 1364 | 5914 | 31271 | 102 | 318MB | 562002MB |
Another run: the overall result is similar, but a detailed comparison of the traces does show some decent improvement.
| version | configuration | name | success | samples | startup | setup | userT | delayedT | 1stBuildT | avgPerRespT | totalT | rulesBuilt | rulesChanged | rulesVisited | rulesTotal | ruleEdges | ghcRebuilds | maxResidency | allocatedBytes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| upstream | All | semanticTokens | True | 50 | 1.83 | 0.00 | 64.00 | 1.18 | 3.50 | 0.62 | 65.18 | 12 | 10 | 1364 | 5914 | 31271 | 52 | 298MB | 287351MB |
| HEAD | All | semanticTokens | True | 50 | 1.91 | 0.00 | 64.61 | 0.70 | 3.60 | 0.62 | 65.32 | 12 | 10 | 1364 | 5914 | 31271 | 52 | 299MB | 285555MB |
cc @wz1000, I have produced some more accurate results; it seems there is about a 5% improvement to the `GetSemanticTokens` rule.
It should be more if we consider the tokenizer alone.
> I have produced some more accurate results; it seems there is about a 5% improvement to the `GetSemanticTokens` rule. It should be more if we consider the tokenizer alone.
FWIW I'm unsure if that's a big enough difference to make this worth it. It might be better just to keep the simpler code (we should definitely add the benchmark, though). Is the improvement bigger on bigger files?
> I have produced some more accurate results; it seems there is about a 5% improvement to the `GetSemanticTokens` rule. It should be more if we consider the tokenizer alone.

> FWIW I'm unsure if that's a big enough difference to make this worth it. It might be better just to keep the simpler code (we should definitely add the benchmark, though). Is the improvement bigger on bigger files?
The benchmark runs on files with 2.5k and 1k lines of code in the Cabal package.
Agreed, it does not make much difference since the current computation is relatively simple. Perhaps I should turn this into a draft and revisit it if/when more features are introduced that make the computation heavier. (By the way, the semantic tokens benchmark and the benchmark fix are already included in the hls-graph patch.)