[NuGet.org Bug]: download counts are inconsistent between Gallery and Search service

Open drewgillies opened this issue 1 year ago • 5 comments

Impact

Other

Describe the bug

Statistics on the search page don't match statistics on the package details page.

Repro Steps

1. Search for Fabulous Scheduler on NuGet.org and view the results: https://www.nuget.org/packages?q=fabulous+scheduler
2. Click on the FabulousScheduler link on the results page and view the stats on the package details page: https://www.nuget.org/packages/FabulousScheduler

They don't match.

Expected Behavior

Both screens show the same download count.

Screenshots

Search results: image

Package details: image

Additional Context and logs

No response

drewgillies avatar Jan 31 '24 03:01 drewgillies

Possibly related: the primary query API has been serving stale download counts for a few weeks, while the secondary endpoint serves higher download counts.

e.g., Compare totalDownloads of ussc vs usnc. According to my records, the last time the primary endpoint updated download counts was Jan 19, 2024 (22 days ago), but the secondary one seems to be working as expected.
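For anyone who wants to repeat this comparison, here is a minimal sketch that diffs `totalDownloads` between two already-parsed search responses. It assumes each response carries a `data` array whose items have `id` and `totalDownloads` fields. The helper names and the sample payloads are hypothetical, and the numbers are made up for illustration.

```python
from typing import Any


def total_downloads_by_id(search_response: dict[str, Any]) -> dict[str, int]:
    """Map lowercased package ID -> totalDownloads for one parsed search response."""
    return {
        item["id"].lower(): int(item["totalDownloads"])
        for item in search_response.get("data", [])
        if "totalDownloads" in item
    }


def download_count_gaps(
    primary: dict[str, Any], secondary: dict[str, Any]
) -> dict[str, int]:
    """Return (secondary - primary) for packages present in both responses
    whose download counts differ."""
    a = total_downloads_by_id(primary)
    b = total_downloads_by_id(secondary)
    return {pid: b[pid] - a[pid] for pid in a.keys() & b.keys() if a[pid] != b[pid]}


# Made-up payloads standing in for the two endpoints' JSON responses.
primary_sample = {
    "data": [
        {"id": "NUnit", "totalDownloads": 1000},
        {"id": "Example.Package", "totalDownloads": 5},
    ]
}
secondary_sample = {
    "data": [
        {"id": "NUnit", "totalDownloads": 1250},
        {"id": "Example.Package", "totalDownloads": 5},
    ]
}

gaps = download_count_gaps(primary_sample, secondary_sample)
print(gaps)
```

A positive gap means the secondary response reports more downloads than the primary one for that package. Packages with identical counts are omitted.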

A side effect of this issue is that NuGet Trends graphs have leveled out for the last few weeks. E.g., https://nugettrends.com/packages?months=6&ids=NUnit

image

I hope this information is helpful. Thanks NuGet team for all you do! 🚀

swharden avatar Feb 10 '24 19:02 swharden

If you independently check download counts on the search API vs. the gallery, you will often see a different number. This is by design: there is no shared, live cache that both the gallery and the search service depend on (and we don't want the headache of a single point of failure).

But the problem related to rendering in the gallery (for example, search results shown in the gallery vs. the package details page) could be resolved by replacing the search API download count with what the gallery knows via its own cache. This would at least make the gallery self-consistent.

A related issue, and more of a bug, is https://github.com/NuGet/NuGetGallery/issues/9928, which concerns gallery self-consistency.

joelverhagen avatar Apr 23 '24 18:04 joelverhagen

> But the problem related to rendering in the gallery (for example, search results shown in the gallery vs. the package details page) could be resolved by replacing the search API download count with what the gallery knows via its own cache. This would at least make the gallery self-consistent.

True but, and I haven't looked at the code yet so I may be wrong about this, I smell at least a potential "SELECT N+1" problem here...

Will look into that as soon as possible.

jodydonetti avatar May 01 '24 10:05 jodydonetti

> True but, and I haven't looked at the code yet so I may be wrong about this, I smell at least a potential "SELECT N+1" problem here...

Yes, depending on the implementation of the cache, you're totally right. Today the cache is a giant in-memory dictionary where point reads are "free", so doing 20-30 download count reads as a fix-up when rendering the search page could work. But a more traditional read-through cache would have that problem.

joelverhagen avatar May 01 '24 13:05 joelverhagen

Ah, good call! I'm thinking about something right now, will update.

jodydonetti avatar May 01 '24 14:05 jodydonetti