
SOLR-8393: Component for resource usage planning

Open igiguere opened this issue 2 years ago • 6 comments

https://issues.apache.org/jira/browse/SOLR-8393

DRAFT:

  • V2 implementation?
  • Question (from @gerlowskija): Are the calculations based on size-estimator-lucene-solr.xls accurate enough to use?
  • Suggestion (from @dsmiley): Has the Metrics API been explored as a solution to the problem/need?

Description

New feature that attempts to extrapolate resources needed in the future by looking at resources currently used.

Original idea by Steve Molloy, with additional parameter based on comment from Shawn Heisey.

Documentation copied from the Jira ticket.

Solution

V1 API: New component: SizeComponent.java. Component can be set on the /select handler to provide sizing for a single core.

New collection operation: ClusterSizing.java. Action 'clustersizing' is added to CollectionsHandler. Class ClusterSizing calls the size component for each core.

Old-style V2 API: a method is added in ClusterApi.java. It calls the V1 implementation, so the question of accuracy remains.
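
For readers who want to try the patch, here is a minimal SolrJ sketch of how the two entry points described above might be exercised. The "size" request flag and the response keys are placeholders assumed purely for illustration; only the 'clustersizing' Collections API action name comes from this description, and none of this is a confirmed public API.

```java
// A minimal SolrJ sketch (not part of the patch) showing how the two entry
// points described above might be exercised against a local Solr instance.
// The "size=true" flag and the "size" response key are hypothetical
// placeholders; the 'clustersizing' action name comes from the description.
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrRequest;
import org.apache.solr.client.solrj.impl.Http2SolrClient;
import org.apache.solr.client.solrj.request.GenericSolrRequest;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.params.ModifiableSolrParams;
import org.apache.solr.common.util.NamedList;

public class SizingSketch {
  public static void main(String[] args) throws Exception {
    try (Http2SolrClient client =
        new Http2SolrClient.Builder("http://localhost:8983/solr").build()) {

      // 1) Per-core sizing via the SizeComponent registered on /select.
      //    "size" is a stand-in for whatever parameter switches the component on.
      SolrQuery query = new SolrQuery("*:*");
      query.set("size", true);
      QueryResponse rsp = client.query("techproducts", query);
      System.out.println("core sizing section: " + rsp.getResponse().get("size"));

      // 2) Cluster-wide sizing via the new Collections API action.
      ModifiableSolrParams params = new ModifiableSolrParams();
      params.set("action", "clustersizing");
      GenericSolrRequest sizingRequest =
          new GenericSolrRequest(SolrRequest.METHOD.GET, "/admin/collections", params);
      NamedList<Object> clusterSizing = client.request(sizingRequest);
      System.out.println("cluster sizing: " + clusterSizing);
    }
  }
}
```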

Tests

The size component is tested in SizeComponentTest.java.

Cluster sizing response is tested in ClusterSizingTest.java.

Full test on a running instance of Solr.

Checklist

Please review the following and check all that apply:

  • [x] I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
  • [x] I have created a Jira issue and added the issue ID to my pull request title.
  • [x] I have given Solr maintainers access to contribute to my PR branch. (optional but recommended)
  • [x] I have developed this patch against the main branch.
  • [x] I have run ./gradlew check.
  • [x] I have added tests for my changes.
  • [x] I have added documentation for the Reference Guide.

igiguere avatar May 10 '23 18:05 igiguere

I'd prefer to see a V2 API added instead of a V1 API. Adding more V1 APIs just adds to the backlog of work on our V2 migration, so I'd love to see a V2 API here instead!

epugh avatar Apr 09 '24 12:04 epugh

I think the hyphenated (kebab-case?) total-disk-size pattern should be changed to camelCase totalDiskSize; that is the pattern we use in the rest of our JSON output.

epugh avatar Apr 09 '24 12:04 epugh

This looks very helpful, though I can't speak to whether it's accurate or not... I'd love to see something replace the old Excel spreadsheet that we recently removed, as it was no longer useful/accurate. Maybe @janhoy you have some thoughts on this...

epugh avatar Apr 09 '24 12:04 epugh

I'd prefer to see a V2 API added instead of a V1 API. Adding more V1 APIs just adds to the backlog of work on our V2 migration, so I'd love to see a V2 API here instead!

Agreed, but, as mentioned, this is from a pre-existing patch. I come back to Solr only about once a year, and usually to apply some old patch on a more recent version. That means I have a limited understanding of what ties into what and why. So, implementing everything needed for clustersizing v2 would be a long and difficult process for me.

Participation is welcomed!

igiguere avatar Apr 10 '24 20:04 igiguere

About to take a look at the code and see if I can help with the v2 side of things, but before I dive into that I figured it was worth asking:

Does size-estimator-lucene-solr.xls actually work for folks? Do you use it regularly @igiguere ? Have you found it to be pretty accurate? Any other folks have experience with it?

I'm happy to be wrong if we have several groups of folks out there in the wild that are using it, but my initial reaction is to be a little skeptical that it's reliable enough to incorporate into Solr.

Primarily because, well, modeling resource usage is a really, really tough problem. There's a reason that the community's only response to sizing questions has always been pretty much "You'll have to Guess-and-Check".

And secondarily, because the spreadsheet this is all based off of was added in 2011 and hasn't really seen much iteration in the decade since. There's an absolute ton that's changed in both Lucene and Solr since then.

gerlowskija avatar Apr 11 '24 17:04 gerlowskija

@gerlowskija

Does size-estimator-lucene-solr.xls actually work for folks? Do you use it regularly @igiguere ? Have you found it to be pretty accurate? Any other folks have experience with it?

Me, personally, no, I don't use it ;). I'll try to find out from client-facing people in the company. I doubt anyone has compiled usage vs success statistics.

UPDATE: I couldn't find anyone who really used size-estimator-lucene-solr.xls or the clustersizing feature (v1). So of course, nobody has any clue about accuracy.

... the community's only response to sizing questions has always been pretty much "You'll have to Guess-and-Check".

The cluster sizing feature is documented to estimate (i.e., guess) resource usage. We could make the documentation clearer that it's not a fool-proof measure. But at least it beats holding a finger to the wind. And it's a bit less complicated than the xls and a calculator.

And secondarily, because the spreadsheet this is all based off of was added in 2011 and hasn't really seen much iteration in the decade since. There's an absolute ton that's changed in both Lucene and Solr since then.

We're only calculating RAM, disk size, and document size. Whatever has changed in Solr and Lucene, if it has an effect on RAM, disk space, or doc size, then it should be reflected in the results... No?

Note that this feature is meant to be used on a current "staging" deployment, to evaluate the eventual size of a "production" environment, for the same version of Solr. No one is expected to draw conclusions from a previous version, so changes from one version to another are not a concern in that way.
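
To make that intent concrete, here is a back-of-the-envelope sketch of the kind of staging-to-production extrapolation described above. The numbers and the simple linear model are illustrative assumptions only; they are not the formulas implemented by SizeComponent.

```java
// An illustrative staging-to-production extrapolation: measure a staging core,
// then scale linearly to a projected production document count. The inputs and
// the linear model are assumptions for illustration, not SizeComponent's math.
public class SizingExtrapolation {
  public static void main(String[] args) {
    long stagingDocs = 2_000_000L;       // documents in the staging core (measured)
    long stagingIndexBytes = 6L << 30;   // ~6 GiB of index on disk (measured)
    long stagingHeapBytes = 512L << 20;  // ~512 MiB of heap attributed to the core (measured)

    long projectedDocs = 50_000_000L;    // expected production document count

    double scale = (double) projectedDocs / stagingDocs;
    long projectedIndexBytes = (long) (stagingIndexBytes * scale);
    long projectedHeapBytes = (long) (stagingHeapBytes * scale);

    System.out.printf("projected index size: %.1f GiB%n",
        projectedIndexBytes / (double) (1L << 30));
    System.out.printf("projected heap usage: %.1f GiB%n",
        projectedHeapBytes / (double) (1L << 30));
  }
}
```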

As a more general note, I should add that I'm a linguist converted to Java dev. Not a mathematician ;) If there's an error in the math, I will never see it.

igiguere avatar Apr 11 '24 20:04 igiguere