NuGetPackageExplorer
Performance is low when reading a large JSON feed
When reading a large JSON feed
Steps:

- Load from https://dotnetfeed.blob.core.windows.net/dotnet-core/index.json
- Loaded:

Current result
Load takes ~16 seconds
Details
The expensive calls are:
var json = await _rawPackageSearchResouce.Search(searchText, _searchContext.Filter, CurrentPage * _pageSize, _pageSize, NullLogger.Instance, token);
json.Select(s => s.FromJToken<PackageSearchMetadata>()).ToList();
inside ShowLatestVersionQueryContext<T>
Notes:
- rawPackageSearchResouce.Search is parsing to JObject
- FromJToken is just a JToken.ToObject<T>(JsonSerializer jsonSerializer) from JSON.NET
- FromJToken is called for each package separately; maybe deserialising the whole content in one step is more efficient.
- full URL of the JSON feed: https://dotnetfeed.blob.core.windows.net/dotnet-core/search/query?q=&skip=0&take=15&prerelease=false&semVerLevel=2.0.0 (141 MB, raw, without gzip etc)
Just want to note that the current logic is similar to the one in NuGet.Client, but without the total count per version "fix": https://github.com/NuGet/NuGet.Client/blob/e9c22b1c5783edefcd9c5175dc76f99206fa14c8/src/NuGet.Core/NuGet.Protocol/Resources/PackageSearchResourceV3.cs#L28-L32
Thanks!
I think neither of them is built for (large) static feeds, are they?
This is maybe a poor man's test, but it looks good in terms of performance:

The CurrentApproachTest is a stripped version of the code from NuGet Client
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Reflection;
using System.Threading;
using System.Threading.Tasks;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;
using NuGet.Protocol;
using Xunit;

public class PerformanceJsonTests
{
    private readonly string _resourceName = "UnitTestProject1.dotnetfeed.blob.core.windows.net.json";

    // Current approach: parse the whole response into a JObject,
    // then convert each package separately with FromJToken.
    [Fact]
    public async Task CurrentApproachTest()
    {
        var token = CancellationToken.None;
        using var stream = GetEmbeddedSource(_resourceName);

        // Act
        var results = await stream.AsJObjectAsync(token);
        var data = results[JsonProperties.Data] as JArray ?? Enumerable.Empty<JToken>();
        var json = data.OfType<JObject>();
        var packages = json.Select(s => s.FromJToken<PackageSearchMetadata>()).ToList();

        // Assert
        AssertPackages(packages);
    }

    // New approach: deserialize the stream directly into typed objects in one step.
    [Fact]
    public void JsonNetTest()
    {
        using var stream = GetEmbeddedSource(_resourceName);

        // Act
        var packages = DeserializeFromStream<FullPackageSearchMetadata>(stream);

        // Assert
        AssertPackages(packages.Data);
    }

    // Wrapper matching the root object of the search response.
    private class FullPackageSearchMetadata
    {
        public List<PackageSearchMetadata> Data { get; set; }
    }

    private static T DeserializeFromStream<T>(Stream s)
    {
        using (var reader = new StreamReader(s))
        using (var jsonReader = new JsonTextReader(reader))
        {
            JsonSerializer ser = JsonExtensions.JsonObjectSerializer;
            return ser.Deserialize<T>(jsonReader);
        }
    }

    private static void AssertPackages(List<PackageSearchMetadata> packages)
    {
        Assert.Equal(1904, packages.Count);
        var package = packages.First();
        Assert.Equal("3.0.0-alpha-26807-18", package.ParsedVersions.First().Version.OriginalVersion);
        Assert.Equal("Accessibility", package.Identity.Id);
    }

    private static Stream GetEmbeddedSource(string resourceName)
    {
        var assembly = Assembly.GetExecutingAssembly();
        var stream = assembly.GetManifestResourceStream(resourceName);
        if (stream == null)
        {
            throw new Exception($"resource {resourceName} not found");
        }
        return stream;
    }
}
full test code here: https://github.com/304NotModified/NuGetPackageExplorer/tree/static-feed-json-parse-performance/UnitTestProject1
results from benchmarkdotnet:
BenchmarkDotNet=v0.11.5, OS=Windows 10.0.17134.885 (1803/April2018Update/Redstone4)
Intel Core i7-8750H CPU 2.20GHz (Coffee Lake), 1 CPU, 12 logical and 6 physical cores
Frequency=2156252 Hz, Resolution=463.7677 ns, Timer=TSC
.NET Core SDK=3.0.100-preview7-012821
[Host] : .NET Core 3.0.0-preview7-27912-14 (CoreCLR 4.700.19.32702, CoreFX 4.700.19.36209), 64bit RyuJIT
Job-JKYBCG : .NET Core 3.0.0-preview7-27912-14 (CoreCLR 4.700.19.32702, CoreFX 4.700.19.36209), 64bit RyuJIT
Core : .NET Core 3.0.0-preview7-27912-14 (CoreCLR 4.700.19.32702, CoreFX 4.700.19.36209), 64bit RyuJIT
Runtime=Core InvocationCount=1 UnrollFactor=1
| Method | Job | IterationCount | LaunchCount | RunStrategy | WarmupCount | Feed | Mean | Error | StdDev | Rank |
|---|---|---|---|---|---|---|---|---|---|---|
| NewApproach | Default | 5 | 1 | Monitoring | 1 | Dotnetfeed | 3,024.743 ms | 40.0995 ms | 10.4137 ms | 3 |
| CurrentApproach | Default | 5 | 1 | Monitoring | 1 | Dotnetfeed | 7,153.443 ms | 430.3471 ms | 111.7598 ms | 4 |
| NewApproach | Core | Default | Default | Default | Default | Dotnetfeed | 3,031.691 ms | 61.7188 ms | 131.5278 ms | 3 |
| CurrentApproach | Core | Default | Default | Default | Default | Dotnetfeed | 7,498.749 ms | 149.6982 ms | 189.3203 ms | 5 |
| NewApproach | Default | 5 | 1 | Monitoring | 1 | Nuget | 3.136 ms | 0.7524 ms | 0.1954 ms | 1 |
| CurrentApproach | Default | 5 | 1 | Monitoring | 1 | Nuget | 4.823 ms | 0.3609 ms | 0.0937 ms | 2 |
| NewApproach | Core | Default | Default | Default | Default | Nuget | 2.994 ms | 0.0582 ms | 0.0544 ms | 1 |
| CurrentApproach | Core | Default | Default | Default | Default | Nuget | 4.858 ms | 0.0946 ms | 0.1263 ms | 2 |
Benchmark with memory usage:

Is the NewApproach this one https://github.com/304NotModified/NuGetPackageExplorer/blob/a6e936f827433dd41e9546cba74a06bcc4719a68/UnitTestProject1/PerformanceJsonTests.cs#L21-L31 ?
How would we integrate the NewApproach, given that we are currently using the RawSearchResourceV3, which returns an IEnumerable<JObject> and not the raw stream?
Yes, NewApproach == JsonNetTest()
About the integration: either send a PR to NuGet (Client?) or fork the relevant classes.
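To make the "fork the relevant classes" option a bit more concrete, here is a rough sketch of how a forked resource could keep an IEnumerable-style API while reading directly from the response stream. It only uses JSON.NET; StreamingSearchReader, ReadData and PackageEntry are made-up names for illustration (PackageEntry stands in for PackageSearchMetadata), and this is not the code from NuGet.Client.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Newtonsoft.Json;

// Hypothetical stand-in for PackageSearchMetadata; only used in this sketch.
public class PackageEntry
{
    public string Id { get; set; }
    public string Version { get; set; }
}

public static class StreamingSearchReader
{
    // Lazily yields the items of the root-level "data" array. Each item is
    // deserialized straight from the reader, so no JObject is built for the
    // whole response and the caller can stop early (e.g. with Take).
    public static IEnumerable<T> ReadData<T>(Stream stream)
    {
        var serializer = JsonSerializer.CreateDefault();
        using (var textReader = new StreamReader(stream))
        using (var jsonReader = new JsonTextReader(textReader))
        {
            while (jsonReader.Read())
            {
                // Only react to a "data" property of the root object (depth 1).
                if (jsonReader.TokenType == JsonToken.PropertyName
                    && jsonReader.Depth == 1
                    && (string)jsonReader.Value == "data")
                {
                    jsonReader.Read(); // move to the start of the array
                    if (jsonReader.TokenType != JsonToken.StartArray)
                    {
                        yield break;
                    }

                    // Each Read() lands on the next item or on the end of the array.
                    while (jsonReader.Read() && jsonReader.TokenType != JsonToken.EndArray)
                    {
                        yield return serializer.Deserialize<T>(jsonReader);
                    }
                    yield break;
                }
            }
        }
    }
}
```

Used as StreamingSearchReader.ReadData<PackageEntry>(stream).Take(15).ToList(), this would stop reading after 15 items instead of materialising the whole 141 MB response. Whether the real PackageSearchMetadata deserializes correctly this way would still need to be checked against the serializer settings in JsonExtensions.JsonObjectSerializer.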
https://github.com/NuGet/NuGet.Client/pull/3406 got merged which should improve memory usage and performance for static feeds.
I looked into the fix in more detail and it will limit static feeds to only return the first items - specified by take. I think it is because static feeds can't be distinguished from dynamic server feeds.
https://github.com/NuGet/NuGet.Client/blob/427adf89c1fa3aab03f4f3840982f2d6b030d3e3/src/NuGet.Core/NuGet.Protocol/Resources/PackageSearchResourceV3.cs#L235-L243
So in our case, as soon as we remove our RawSearchResourceV3 workaround, we would only show the first 15 results (in a few seconds), and scrolling down would load the same 15 results again, indefinitely.
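A small simulation of that behaviour, in case it helps the discussion. It assumes a static feed simply ignores the skip and take query parameters and always returns the full list; StaticFeedPaging and its methods are hypothetical names, and package ids are plain strings to keep it short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StaticFeedPaging
{
    // Simulates a static feed: the query string is ignored, so every request
    // returns the same full list regardless of skip and take.
    public static IReadOnlyList<string> QueryStaticFeed(IReadOnlyList<string> all, int skip, int take)
    {
        return all;
    }

    // Only "take" is applied to the response: every page is the same first items.
    public static List<string> GetPageTakeOnly(IReadOnlyList<string> all, int page, int pageSize)
    {
        return QueryStaticFeed(all, page * pageSize, pageSize)
            .Take(pageSize)
            .ToList();
    }

    // Client-side paging: skip and take are both applied locally to the full response.
    public static List<string> GetPage(IReadOnlyList<string> all, int page, int pageSize)
    {
        return QueryStaticFeed(all, page * pageSize, pageSize)
            .Skip(page * pageSize)
            .Take(pageSize)
            .ToList();
    }
}
```

With 40 packages and a page size of 15, GetPageTakeOnly returns the first 15 packages for every page, while GetPage returns items 15 to 29 for page 1 and the remaining 10 for page 2. A real implementation would presumably cache the full response once instead of requesting it again for every page.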
@campersau can you please file a bug on the NuGet repo describing this limitation? Seems like there's still a gap that needs to be fixed.
It looks like static feeds will not be supported by NuGet.Client (see https://github.com/NuGet/Home/issues/9726#issuecomment-654581621).
So I think we have two options now:
- Try to support static feeds on our own (as long as RawSearchResourceV3 is still there)
  - Copying and adjusting some code from https://github.com/NuGet/NuGet.Client/pull/3406
- Drop support for static feeds
Aren't static feeds one of the main benefits of NPE over nuget.org/another package website/Azure DevOps?