dataverse icon indicating copy to clipboard operation
dataverse copied to clipboard

Migration HttpSolrClient to Http2SolrClient and ConcurrentUpdateHttp2SolrClient

Open jeromeroucou opened this issue 1 year ago • 7 comments

What this PR does / why we need it:

This PR replace deprecated HttpSolrClient with 2 new implementations :

  • Http2SolrClient to make queries to Solr
  • ConcurrentUpdateHttp2SolrClient to perform index to Solr

Which issue(s) this PR closes:

Closes #10161

Special notes for your reviewer:

I've been careful to centralize code into an abstract class, and delete local client on IndexServiceBean to use instead the 2 new service client (one for queries, one for indexation)

Suggestions on how to test this:

I run some local tests on my laptop : create collection, publish dataset with files, search on index page, harvesting other repository, call clear-orphan API. But I didn't index a large collection to measure if performance had improved.

Does this PR introduce a user interface change? If mockups are available, please link/include them here:

N/A

Is there a release notes update needed for this change?:

If indexation performance is increase, I think it would be nice to mention it.

Additional documentation:

Configuration documentation has been updated with the new feature-flag to enable it.

jeromeroucou avatar Jan 18 '24 10:01 jeromeroucou

Coverage Status

coverage: 22.754% (+0.002%) from 22.752% when pulling 7cae923c2d9143006ad8f39ceb0329644cd73643 on Recherche-Data-Gouv:10161-Http2SolrClient into 5073daa79241fb6602a6f039df37e570b71fc59b on IQSS:develop.

coveralls avatar Jan 18 '24 10:01 coveralls

@jeromeroucou - thanks for the PR! A ~quick question - have you run this extensively?

I ask because I tried to make a simple switch from HttpSolrClient to Http2SolrClient in the SolrClientService class at QDR and started seeing errors w.r.t. the client being closed that started after some delay (~days). It's possible that there were restarts of solr going on at QDR so I just reverted and was going to check again at some point after the solr changes were done, but I haven't yet.

If the new Http2SolrClient does somehow timeout (or perhaps it is just worse at handling a solr restart?), we'd want to address that (e.g. with reinitialize). If you haven't tried running this branch for days, I may try again at QDR and/or will make sure that we test a longer duration in QA.

qqmyers avatar Jan 18 '24 19:01 qqmyers

Hi @qqmyers, thanks for this feedback.

No, I don't run Http2SolrClient some days, but I'll try to do it on my side too. I'll indicate the result.

jeromeroucou avatar Jan 19 '24 08:01 jeromeroucou

@jeromeroucou hi! Any results to share? Thanks!

pdurbin avatar Feb 28 '24 18:02 pdurbin

Unfortunately no, our deployment environment is still broken. We're still working on restoring it. I hope to be able to give you an update within 2 weeks.

jeromeroucou avatar Feb 29 '24 08:02 jeromeroucou

General remark on the PR: the HTTP2 classes are still flagged experimental, see https://solr.apache.org/guide/solr/latest/deployment-guide/solrj.html#types-of-solrclients. Do we really want to do this? Should we potentially use a feature-flag, so it keeps the experimental nature also in Dataverse?

poikilotherm avatar Feb 29 '24 08:02 poikilotherm

Hi @poikilotherm,

I think the documentation you're referring to specifies that the client is in progress (I understand that it's not out of the question for a method signature to be modified for exemple). However, Solr Javadoc clearly indicates HttpSolrClient is deprecated and Http2SolrClient should be used instead.

But your proposition to use a feature-flag may be a good thing until HttpSolrClient is really removed.

jeromeroucou avatar Feb 29 '24 10:02 jeromeroucou

The PR is finally ready for review again!

I've taken @poikilotherm feedback into account by using a feature flag.

Regarding @qqmyers comments, we deployed this branch in an environment that remained switched on for over a week, without observing any connection interruption. However, not really knowing how to check, we may not have noticed the errors?

jeromeroucou avatar Oct 30 '24 08:10 jeromeroucou

Thanks @jeromeroucou.

I had mischaracterized this PR as a feature but it's really more of a bug fix (stopping the use of deprecated code) so the team will talk about it at our weekly review of "ready for triage" on Tuesday.

pdurbin avatar Oct 31 '24 21:10 pdurbin

Since it has been determined that this is a bug and not a feature I believe the Feature Flag needs to be removed. @pdurbin @poikilotherm @qqmyers Please feel free to give your feedback.

stevenwinship avatar Nov 13 '24 16:11 stevenwinship

@stevenwinship sure, makes sense.

pdurbin avatar Nov 13 '24 19:11 pdurbin

Feature flag is now removed

jeromeroucou avatar Nov 15 '24 17:11 jeromeroucou

Hello, can we please bump the version to 6.5 in the pom.xml

ofahimIQSS avatar Jan 15 '25 20:01 ofahimIQSS

Hello @ofahimIQSS , I've just merge develop branch

jeromeroucou avatar Jan 22 '25 09:01 jeromeroucou

Testing Passed - I indexed large collections on internal, no issues found. Merging.

ofahimIQSS avatar Jan 22 '25 17:01 ofahimIQSS