Optimization: cache semantic lookup in tokenizer, add semantic tokens benchmarks
- There is a high chance that an identifier appears multiple times in a file, so caching the identifier's semantic type result in the tokenizer saves repeated computation.
- A new test case is added to ensure invisible tokens do not sabotage the cache lookup.
- Add a benchmark for semantic tokens.
- Fix a benchmark running issue.
It gains about a 5% performance improvement for the GetSemanticTokens rule.
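The caching idea can be sketched roughly as follows. This is a minimal illustration under assumed names, not the actual HLS code: `Identifier`, `SemanticType`, `computeSemanticType`, and `tokenizeFile` are hypothetical stand-ins (the real tokenizer works over HIE AST and type information, not strings). The point is that a per-pass map means each distinct identifier is computed at most once:

```haskell
import qualified Data.Map.Strict as Map
import Data.List (mapAccumL)

-- Hypothetical stand-ins for the real HLS types.
type Identifier = String
data SemanticType = Variable | TypeCon deriving (Show, Eq)

-- Placeholder for the expensive per-identifier computation we want
-- to avoid repeating.
computeSemanticType :: Identifier -> SemanticType
computeSemanticType name = case name of
  (c:_) | c >= 'A' && c <= 'Z' -> TypeCon
  _                            -> Variable

-- Memoized lookup threaded through the tokenizer pass: consult the
-- cache first, compute and insert only on a miss.
cachedSemanticType :: Map.Map Identifier SemanticType -> Identifier
                   -> (Map.Map Identifier SemanticType, SemanticType)
cachedSemanticType cache name =
  case Map.lookup name cache of
    Just ty -> (cache, ty)                         -- cache hit: no recomputation
    Nothing -> let ty = computeSemanticType name   -- cache miss: compute once
               in (Map.insert name ty cache, ty)

-- Tokenize a whole file's identifiers with one shared cache per pass,
-- so repeated occurrences of the same name become cheap lookups.
tokenizeFile :: [Identifier] -> [SemanticType]
tokenizeFile = snd . mapAccumL cachedSemanticType Map.empty
```

Since the same identifiers usually recur many times in a source file, the cache turns most lookups into `Map.lookup` hits rather than full recomputations.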
Could we have some benchmarks on large files to see if this actually makes a difference?
With

```yaml
- name: cabal
  package: Cabal
  version: 3.6.3.0
  modules:
    - src/Distribution/Simple/Configure.hs
```

and running:

```shell
cabal bench -j --benchmark-options="profiled-cabal"
```
| version | configuration | name | success | samples | startup | setup | userT | delayedT | 1stBuildT | avgPerRespT | totalT | rulesBuilt | rulesChanged | rulesVisited | rulesTotal | ruleEdges | ghcRebuilds | maxResidency | allocatedBytes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| upstream | All | semanticTokens | True | 50 | 2.09 | 0.00 | 9.37 | 0.03 | 3.10 | 0.13 | 9.40 | 5147 | 5146 | 5147 | 5908 | 31251 | 1 | 174MB | 13491MB |
| cache-semantic-lookup | All | semanticTokens | True | 50 | 1.87 | 0.00 | 8.36 | 0.10 | 2.77 | 0.11 | 8.46 | 5147 | 5146 | 5147 | 5908 | 31251 | 1 | 173MB | 13437MB |
Looks like a decent improvement. Could you add this benchmark to the ghcide-bench experiments please?
the ghcRebuilds number seems to indicate that we aren't editing the file in the benchmark passes. Could you ensure that each benchmark iteration also edits the file?
> the `ghcRebuilds` number seems to indicate that we aren't editing the file in the benchmark passes. Could you ensure that each benchmark iteration also edits the file?
Sure, I will make some changes.
I need to take some more time to come up with a better benchmark for this. Converting it to a draft for now.
The overall improvement is not really obvious: slightly better memory usage and speed.
| version | configuration | name | success | samples | startup | setup | userT | delayedT | 1stBuildT | avgPerRespT | totalT | rulesBuilt | rulesChanged | rulesVisited | rulesTotal | ruleEdges | ghcRebuilds | maxResidency | allocatedBytes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| upstream | All | semanticTokens | True | 100 | 2.00 | 0.00 | 128.78 | 1.27 | 3.41 | 0.63 | 130.06 | 12 | 10 | 1364 | 5914 | 31271 | 102 | 296MB | 568821MB |
| cache-semantic-lookup | All | semanticTokens | True | 100 | 1.92 | 0.00 | 126.47 | 2.11 | 3.43 | 0.62 | 128.60 | 12 | 10 | 1364 | 5914 | 31271 | 102 | 290MB | 562533MB |
The `GetSemanticTokens` rule consistently gives better results.
Another result: after updating to master, it does seem to show a decent improvement.
| version | configuration | name | success | samples | startup | setup | userT | delayedT | 1stBuildT | avgPerRespT | totalT | rulesBuilt | rulesChanged | rulesVisited | rulesTotal | ruleEdges | ghcRebuilds | maxResidency | allocatedBytes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| upstream | All | semanticTokens | True | 100 | 1.99 | 0.00 | 148.55 | 2.80 | 3.75 | 0.73 | 151.36 | 12 | 10 | 1364 | 5914 | 31271 | 102 | 307MB | 594060MB |
| cache-semantic-lookup | All | semanticTokens | True | 100 | 1.93 | 0.00 | 134.02 | 1.15 | 6.84 | 0.64 | 135.18 | 12 | 10 | 1364 | 5914 | 31271 | 102 | 318MB | 562002MB |
Another run: the overall result is similar, but a detailed comparison of the traces does show some decent improvement.
| version | configuration | name | success | samples | startup | setup | userT | delayedT | 1stBuildT | avgPerRespT | totalT | rulesBuilt | rulesChanged | rulesVisited | rulesTotal | ruleEdges | ghcRebuilds | maxResidency | allocatedBytes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| upstream | All | semanticTokens | True | 50 | 1.83 | 0.00 | 64.00 | 1.18 | 3.50 | 0.62 | 65.18 | 12 | 10 | 1364 | 5914 | 31271 | 52 | 298MB | 287351MB |
| HEAD | All | semanticTokens | True | 50 | 1.91 | 0.00 | 64.61 | 0.70 | 3.60 | 0.62 | 65.32 | 12 | 10 | 1364 | 5914 | 31271 | 52 | 299MB | 285555MB |
cc @wz1000, I have produced some more accurate results; it seems there is about a 5% improvement to the `GetSemanticTokens` rule.
It should be more if we consider the tokenizer alone.
> I have produced some more accurate results; it seems there is about a 5% improvement to the `GetSemanticTokens` rule. It should be more if we consider the tokenizer alone.
FWIW I'm unsure if that's a big enough difference to make this worth it. It might be better just to keep the simpler code (we should definitely add the benchmark, though). Is the improvement bigger on bigger files?
> I have produced some more accurate results; it seems there is about a 5% improvement to the `GetSemanticTokens` rule. It should be more if we consider the tokenizer alone.

> FWIW I'm unsure if that's a big enough difference to make this worth it. It might be better just to keep the simpler code (we should definitely add the benchmark, though). Is the improvement bigger on bigger files?
The benchmark runs on files with 2.5k and 1k lines of code in the Cabal package.
Agreed, it does not make much difference since the current computation is relatively simple. Perhaps I should turn this into a draft and revisit it if/when more features are introduced that make the computation heavier. (By the way, the semantic tokens benchmark and the benchmark fix are already included in the hls-graph patch.)