aspnetcore icon indicating copy to clipboard operation
aspnetcore copied to clipboard

Improve route analyzer performance with highly concatenated strings

Open JamesNK opened this issue 5 months ago • 1 comments

Fixes https://github.com/dotnet/aspnetcore/issues/54293

Route analyzer introduced in .NET 8 is causing very long build times if some projects. Highly concatenated strings caused a lot of time to be spent in HasLanguageComment.

The PR:

  • Changes analyzer's node iteration from DescendantTokens to use a stack. This matches Roslyn's regex analyzer.

  • Caches the last analyzed string and uses it when searching for language comment. This allows the loop that checks all parents to exit early.

    For example, analysis of var s = "a" + "b" + "c" will:

    • Analyze "a" for a language comment. It doesn't have one. This info about "a" is cached.
    • Analyze "b" which will check itself and then its parent "a". Use the previously cached value. Info about "b" is cached.
    • "c" then uses "b", and so on.

    This keeps analysis constant and prevents HasLanguageComment from O(N^2) complexity in highly concatenated strings.

  • Adds a test that recreates https://github.com/dotnet/aspnetcore/issues/54293.

    • Before change: 5 seconds
    • After change: ~0.2 seconds~ 0.05 seconds

This PR will likely be backported to 8.0.

JamesNK avatar Mar 11 '24 05:03 JamesNK

@jjonescz, @CyrusNajmabadi, @cston Roslyn expert opinion on this fix would be great.

JamesNK avatar Mar 12 '24 01:03 JamesNK

I've greatly simplified this PR.

The new workaround is to avoid calling GetLeadingTrivia() on every level of a nested binary expression. Instead, check the first parent of the string node and then resume checking once out of the nested binary expression tree.

Detection behavior doesn't 100% emulate embedded language classification by Roslyn, but I've found some odd quirks when mixing multiple lang comments in a string concat tree from Roslyn anyway.

The analyzer behavior here is good enough for normal usage. It's the best balance I've found between good behavior, good performance, and minimal code change.

JamesNK avatar Mar 18 '24 14:03 JamesNK

image

CyrusNajmabadi avatar Mar 20 '24 17:03 CyrusNajmabadi

@JamesNK i just went with https://github.com/dotnet/roslyn/pull/72620 as a simple approach.

CyrusNajmabadi avatar Mar 20 '24 18:03 CyrusNajmabadi

@JamesNK i just went with dotnet/roslyn#72620 as a simple approach.

Perfect, I like it.

Trying to match Roslyn behavior while getting good perf was wrapping this PR up in knots. Great that Roslyn just simplified, and this PR can copy it.

Performance with this change is good - 0.09s after compared to 5s before

JamesNK avatar Mar 21 '24 03:03 JamesNK

/azp run

JamesNK avatar Mar 21 '24 03:03 JamesNK

Azure Pipelines successfully started running 3 pipeline(s).

azure-pipelines[bot] avatar Mar 21 '24 03:03 azure-pipelines[bot]

/azp run

JamesNK avatar Mar 21 '24 23:03 JamesNK

Azure Pipelines successfully started running 3 pipeline(s).

azure-pipelines[bot] avatar Mar 21 '24 23:03 azure-pipelines[bot]

/backport to release/8.0

JamesNK avatar Mar 26 '24 02:03 JamesNK

Started backporting to release/8.0: https://github.com/dotnet/aspnetcore/actions/runs/8429842063

github-actions[bot] avatar Mar 26 '24 02:03 github-actions[bot]