NuGet.Client
NuGet.Client copied to clipboard
Reduce allocations in TokenSegment.TryMatch
Bug
Fixes: https://github.com/NuGet/Home/issues/12728
Also closes internal workitem.
Regression? Last working version:
Description
This uses ReadOnlyMemory<char>.Slice()
instead of string.Substring()
to reduce allocations when pattern matching. This also needs to cache the matches with a Dictionary<ReadOnlyMemory<char>, object>
in PatternTable
so I needed to re-use an ordinal comparer that I found in another repo. The rest of the changes are fallout from changing the pattern we're looking for from a string
to ReadOnlyMemory<char>
. The reason I'm using ReadOnlyMemory<T>
instead of ReadOnlySpan<T>
is that ReadOnlySpan<T>
is a readonly ref struct
and can't be used as a type parameter.
I had to change the public API of NuGet.ContentModel.ContentPropertyDefinition
and NuGet.ContentModel.PatternTable.TryLookup
. I'm not sure how widely those are used by anyone so I marked the associated issue as a "breaking change" just to be safe.
PR Checklist
-
[x] PR has a meaningful title
-
[x] PR has a linked issue.
-
[x] Described changes
-
Tests
- [ ] Automated tests added
- OR
- [x] Test exception - Updated existing tests
- OR
- [ ] N/A
-
Documentation
- [ ] Documentation PR or issue filled
- OR
- [x] N/A
@jeffkl Do you have any numbers on this?
@jeffkl What's blocking this from making progress?
@jeffkl What's blocking this from making progress?
I was the NuGet on-call engineer last week. Hoping to get back to this today or tomorrow, I'm also helping with the 1ES PT effort.
@jeffkl - I have a slightly different understanding of why this code path is causing too many allocations. My understanding is that the current pattern-matching algorithm is suboptimal. For every asset in the package, we check it against all the patterns to find a match. Instead, we could have some kind of state machine that we traverse with each asset exactly once to determine the asset type. For example, in the image below, I have added numbers (1 to 10) to denote roots. Each asset will navigate through only one root, and if it reaches the terminal state on that path, then there is a match, meaning it could be a runtime assembly, a native assembly, or something else.
@jeffkl - I have a slightly different understanding of why this code path is causing too many allocations. My understanding is that the current pattern-matching algorithm is suboptimal. For every asset in the package, we check it against all the patterns to find a match. Instead, we could have some kind of state machine that we traverse with each asset exactly once to determine the asset type. For example, in the image below, I have added numbers (1 to 10) to denote roots. Each asset will navigate through only one root, and if it reaches the terminal state on that path, then there is a match, meaning it could be a runtime assembly, a native assembly, or something else.
Team Triage: Filed a tech debt issue https://github.com/NuGet/Client.Engineering/issues/2780.
@jeffkl Do you have any numbers on this?
Before:
After:
Can you share out a trace?
From: Jeff Kluge @.> Sent: Friday, April 5, 2024 6:55 AM To: NuGet/NuGet.Client @.> Cc: David Kean @.>; Review requested @.> Subject: Re: [NuGet/NuGet.Client] Reduce allocations in TokenSegment.TryMatch (PR #5676)
@jeffklhttps://github.com/jeffkl Do you have any numbers on this?
Before: image.png (view on web)https://github.com/NuGet/NuGet.Client/assets/17556515/e3383573-3d28-40b8-8e27-ca1307d2c65a
After: image.png (view on web)https://github.com/NuGet/NuGet.Client/assets/17556515/1a869020-04fb-4ac0-875d-877487766b75
— Reply to this email directly, view it on GitHubhttps://github.com/NuGet/NuGet.Client/pull/5676#issuecomment-2038095391, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AAINQIQFJT2BMFOPAGSYEZDY3WV4LAVCNFSM6AAAAABENHJ4UWVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDAMZYGA4TKMZZGE. You are receiving this because your review was requested.Message ID: @.***>
Can you share out a trace? … ________________________________ From: Jeff Kluge @.> Sent: Friday, April 5, 2024 6:55 AM To: NuGet/NuGet.Client @.> Cc: David Kean @.>; Review requested @.> Subject: Re: [NuGet/NuGet.Client] Reduce allocations in TokenSegment.TryMatch (PR #5676) @jeffklhttps://github.com/jeffkl Do you have any numbers on this? Before: image.png (view on web)https://github.com/NuGet/NuGet.Client/assets/17556515/e3383573-3d28-40b8-8e27-ca1307d2c65a After: image.png (view on web)https://github.com/NuGet/NuGet.Client/assets/17556515/1a869020-04fb-4ac0-875d-877487766b75 — Reply to this email directly, view it on GitHub<#5676 (comment)>, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AAINQIQFJT2BMFOPAGSYEZDY3WV4LAVCNFSM6AAAAABENHJ4UWVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDAMZYGA4TKMZZGE. You are receiving this because your review was requested.Message ID: @.***>
I compared the traces. The highlighted areas where I believe the above change affects. We moved ~45 MB -> 22 MB, but I want to see this is repeatible (I see differences where I don't expect), so can you run the after a few times and make sure that the TryMatch numbers match below within a MB or so. If not, then we don't have a deterministic repro, and we should pick something larger.
Here are 4 runs of before:
And after:
Here are 4 runs of before:
And after:
Hmm... Something seems off. The metrics in the "after" look worse. Max peak MB from ~1,200 MB => ~1,700 MB, Total Pause from ~252 => ~460.
In the before GC stats, only 1 of the 4 had a gen 2 GC, which only paused for 14ms. All 4 of the after GC stats show a gen 2 GC, and takes around 150+ ms.
I'm not super convinced its testing the exact same thing; I see allocation increases/decreases in the places I expect, but increases in MSBuild code:
Why would MSBuild be doing more work?
Why would MSBuild be doing more work?
@jeffkl could you address some of the pending concerns in this PR?
Team Triage: Based on the screenshots and comments in here, since the before and after have a huge difference in the total allocations, we need to get equivalent before/after traces so we can analyze the allocations of the specific methods better.
I'll help out with this work.
I'll close this PR for now. I'll open a new one if need be.
The perfview traces I collected don't really show the improvement consistently.
Using benchmarkdotnet, I have some benchmarks, where I basically ran through my GPF, running selection against every package with the 2 implementations.
The total allocations went from 282.41 MB to 253.44 MB (Assuming no LOH which would be missed by benchmark dotnet) Microbenchmarks on single packages aren't as promising as this.
- I'll try to minimize the noise in the full restore benchmarks, see if we can see anything productive there.
- Explore some of the ideas I have based on the feedback here and some other things I saw while digging into these changes.
This PR:
Name Inc % Inc
nuget.packaging!NuGet.ContentModel.Infrastructure.PatternExpression+TokenSegment.TryMatch(class NuGet.ContentModel.ContentItem&,class System.String,class System.Collections.Generic.IReadOnlyDictionary`2,int32,int32&) 1.1 99,485,616
+ mscorlib.ni!System.Collections.Generic.Dictionary`2[System.__Canon,System.__Canon].Insert(System.__Canon, System.__Canon, Boolean) 0.4 37,982,336
+ nuget.packaging!ContentPropertyDefinition.TryLookup 0.3 22,924,920
+ nuget.packaging!NuGet.ContentModel.ContentItem.get_Properties() 0.2 16,821,344
+ system.memory.ni!? 0.1 11,678,544
+ clr!JIT_New 0.1 10,078,472
Name Inc % Inc
nuget.packaging!NuGet.ContentModel.Infrastructure.PatternExpression+TokenSegment.TryMatch(class NuGet.ContentModel.ContentItem&,class System.String,class System.Collections.Generic.IReadOnlyDictionary`2,int32,int32&) 1.1 100,373,768
+ mscorlib.ni!System.Collections.Generic.Dictionary`2[System.__Canon,System.__Canon].Insert(System.__Canon, System.__Canon, Boolean) 0.4 35,133,768
+ nuget.packaging!NuGet.ContentModel.ContentItem.get_Properties() 0.3 24,102,992
+ nuget.packaging!ContentPropertyDefinition.TryLookup 0.2 21,077,912
+ system.memory.ni!? 0.1 10,791,424
+ clr!JIT_New 0.1 9,267,672
Name Inc % Inc
nuget.packaging!NuGet.ContentModel.Infrastructure.PatternExpression+TokenSegment.TryMatch(class NuGet.ContentModel.ContentItem&,class System.String,class System.Collections.Generic.IReadOnlyDictionary`2,int32,int32&) 1.0 93,417,816
+ mscorlib.ni!System.Collections.Generic.Dictionary`2[System.__Canon,System.__Canon].Insert(System.__Canon, System.__Canon, Boolean) 0.4 32,911,760
+ nuget.packaging!NuGet.ContentModel.ContentItem.get_Properties() 0.2 20,449,768
+ nuget.packaging!ContentPropertyDefinition.TryLookup 0.2 19,134,128
+ system.memory.ni!? 0.1 11,734,512
+ clr!JIT_New 0.1 9,187,648
17.11 P3
Name Inc % Inc
nuget.packaging!NuGet.ContentModel.Infrastructure.PatternExpression+TokenSegment.TryMatch(class NuGet.ContentModel.ContentItem&,class System.String,class System.Collections.Generic.IReadOnlyDictionary`2,int32,int32&) 1.3 110,683,792
+ mscorlib.ni!System.String.Substring(Int32, Int32) 0.4 37,942,624
+ mscorlib.ni!System.Collections.Generic.Dictionary`2[System.__Canon,System.__Canon].Insert(System.__Canon, System.__Canon, Boolean) 0.4 35,833,368
+ clr!JIT_New 0.4 32,210,392
+ nuget.packaging!ContentPropertyDefinition.TryLookup 0.1 4,697,408
Name Inc % Inc
nuget.packaging!NuGet.ContentModel.Infrastructure.PatternExpression+TokenSegment.TryMatch(class NuGet.ContentModel.ContentItem&,class System.String,class System.Collections.Generic.IReadOnlyDictionary`2,int32,int32&) 1.1 100,833,848
+ mscorlib.ni!System.String.Substring(Int32, Int32) 0.4 35,158,928
+ mscorlib.ni!System.Collections.Generic.Dictionary`2[System.__Canon,System.__Canon].Insert(System.__Canon, System.__Canon, Boolean) 0.4 33,438,144
+ clr!JIT_New 0.3 25,918,536
+ nuget.packaging!ContentPropertyDefinition.TryLookup 0.1 6,318,240
Name Inc % Inc
nuget.packaging!NuGet.ContentModel.Infrastructure.PatternExpression+TokenSegment.TryMatch(class NuGet.ContentModel.ContentItem&,class System.String,class System.Collections.Generic.IReadOnlyDictionary`2,int32,int32&) 1.3 111,532,960
+ mscorlib.ni!System.String.Substring(Int32, Int32) 0.5 40,336,008
+ mscorlib.ni!System.Collections.Generic.Dictionary`2[System.__Canon,System.__Canon].Insert(System.__Canon, System.__Canon, Boolean) 0.4 34,488,920
+ clr!JIT_New 0.3 30,443,528
+ nuget.packaging!ContentPropertyDefinition.TryLookup 0.1 6,264,504
3 runs each, on average, it's about 10MB for OrchardCore. 1/3 runs were a bit noisy and showing similar results.
Dictionary is the same amount of allocations, no substring allocations. TryLookup is worse because of the ToString actualizations. A bit of a trade-off.
Apparently force-pushing a branch prevents me from reopening the PR.
Please review in https://github.com/NuGet/NuGet.Client/pull/5933.
All feedback here has been addressed. There's a lot, so do let me know if I missed something.