
ResourceSync resource lists should scale to > 50K items

Open mjgiarlo opened this issue 9 years ago • 5 comments

Decomposed from projecthydra/sufia#2915.

Hyrax defines a simple ResourceSync implementation that provides an API for harvesting deposited works via sitemaps and content negotiation. It currently scales up to a maximum of 50K items (the upper bound for a sitemap), but it should scale beyond 50K items.

mjgiarlo avatar Dec 19 '16 18:12 mjgiarlo

@mjgiarlo: As written, this is more of a requirement than an implementable ticket. Please specify what implementation direction should be pursued.

atz avatar Apr 07 '17 23:04 atz

@atz part of the sitemap spec says, "thou shalt not have more than 50k items in a sitemap". We are violating that. We need to partition it somehow (ResourceSync provides a mechanism for multiple resource lists), but we haven't decided how to partition yet.
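
One possible partitioning, sketched in Python rather than Hyrax's Ruby and not taken from the Hyrax codebase: split the resource URLs into chunks of at most 50,000, emit one resource list per chunk, and tie the chunks together with a resource list index. The function names, the `example.org` URLs, and the way the `at` timestamp is passed in are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Iterator, List, Sequence

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
RS_NS = "http://www.openarchives.org/rs/terms/"
MAX_URLS_PER_LIST = 50_000  # upper bound for a single sitemap document

# Serialize the sitemap namespace as the default and ResourceSync as "rs".
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("rs", RS_NS)


def partition(urls: Sequence[str], size: int = MAX_URLS_PER_LIST) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `size` URLs (hypothetical helper)."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(urls), size):
        yield list(urls[start:start + size])


def _build(root_tag: str, child_tag: str, locs: Sequence[str], at: str) -> str:
    """Build a sitemap-style document carrying a resourcelist rs:md element."""
    root = ET.Element(f"{{{SITEMAP_NS}}}{root_tag}")
    ET.SubElement(root, f"{{{RS_NS}}}md", {"capability": "resourcelist", "at": at})
    for loc in locs:
        entry = ET.SubElement(root, f"{{{SITEMAP_NS}}}{child_tag}")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = loc
    return ET.tostring(root, encoding="unicode")


def build_resource_list(urls: Sequence[str], at: str) -> str:
    """One <urlset> resource list for a single chunk of resource URLs."""
    return _build("urlset", "url", urls, at)


def build_resource_list_index(list_urls: Sequence[str], at: str) -> str:
    """A <sitemapindex> pointing at each partitioned resource list."""
    return _build("sitemapindex", "sitemap", list_urls, at)
```

With 120,001 works this yields three resource lists (50,000, 50,000, and 20,001 entries) plus one index that references them. How the chunks are chosen (by ID range, by modification date, or otherwise) is the decision that is still open.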

jcoyne avatar Apr 08 '17 01:04 jcoyne

@atz @mjgiarlo @jcoyne We are seeing the effect of not scaling beyond 50k. Our repo now has more than 100k items, and using the changelist is putting a lot of stress on our servers.

I would appreciate any relevant information, or pointers on how to go about addressing this issue.

tahirpoduska avatar Apr 12 '19 13:04 tahirpoduska

Redirecting :point_up: to folks who are actively working on Hyrax: @no-reply @vantuyls @samvera/hyrax-code-reviewers

mjgiarlo avatar Apr 12 '19 14:04 mjgiarlo

I guess we had better slate this for work in the 3.x series.

no-reply avatar Sep 20 '19 23:09 no-reply