solr icon indicating copy to clipboard operation
solr copied to clipboard

SOLR-16470: Create V2 equivalent of V1 Replication: Get files/{filePath}

Open mlbiscoc opened this issue 1 year ago • 2 comments

https://issues.apache.org/jira/browse/SOLR-16470

Description

Create a V2 equivalent API for the replication "GET files/{filePath}"

V1 /solr/{coreName}/replication?command=filecontent&file=_0_Lucene99_0.tmd&wt=filestream

V2 /api/cores/{coreName}/replication/files/{filePath}?dirType=file

Other sample requests

/api/cores/{coreName}/replication/files/./tlog.0000000000000000000?dirType=tlogFile /api/cores/{coreName}/replication/files/./solrconfig.xml?dirType=cf

Solution

Moved getFileStream() logic into ReplicationAPIBase class and updated the V1 endpoint to use the CoreReplicationAPI class so that the ReplicationHandler can be deprecated in the future.

For the V2 api, made dirType a required parameters specifying what file directory (cf/tlogFile/file) to fetch a file from and better expressed the optional parameters that exist for the api

  • offset
  • compression
  • checksum
  • maxWriteMBPerSec
  • generation

Tests

Wrote new unit test in CoreReplicationAPITest class testFetchFile()

Checklist

Please review the following and check all that apply:

  • [x] I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
  • [x] I have created a Jira issue and added the issue ID to my pull request title.
  • [x] I have given Solr maintainers access to contribute to my PR branch. (optional but recommended, not available for branches on forks living under an organisation)
  • [x] I have developed this patch against the main branch.
  • [x] I have run ./gradlew check.
  • [x] I have added tests for my changes.
  • [x] I have added documentation for the Reference Guide

mlbiscoc avatar Sep 30 '24 20:09 mlbiscoc

Hey @gerlowskija, should be the last API for SOLR-16470 whenever you get time to review. Thanks!

mlbiscoc avatar Sep 30 '24 20:09 mlbiscoc

Awesome - thanks @mlbiscoc . Sorry for the delay getting to this - took me awhile to catch back up after the travel last week. Reviewing now...

gerlowskija avatar Oct 17 '24 16:10 gerlowskija

Hey @gerlowskija made the changes but ended up kind of inflating this PR to support the OpenAPI code generation and moving files. Lmk if this went the right direction

mlbiscoc avatar Nov 01 '24 15:11 mlbiscoc

Hey @gerlowskija made the changes but ended up kind of inflating this PR to support the OpenAPI code generation and moving files

Awesome! I see a lot of the increase comes from moving some of the other replication APIs ("list files", "get index version") from core over to the "api" module -- I've actually got another PR that takes a stab at some of those changes as well, so if we rebase one on top of the other, that should shrink this PR back down a good bit. Will take a crack at that shortly, and check back in when that's done!

Very curious about the "StreamingOutput" type you found. I've used InputStream in a few similar cases with other APIs (e.g. there's a "filestore" API that returns raw files, similar to this /replication one), though maybe StreamingOutput is better? Will have to do some reading...

gerlowskija avatar Nov 11 '24 13:11 gerlowskija

Alright, merged in the latest 'main' and synced it with the changes here. The only real change of note is that the interface file is now named 'ReplicationApis.java' (I went with the name on 'main' since it was already there. The merge conflicts were a bit hairy, so if anything other than that seems "off", it's likely a mistake so please point it out 😆

Hoping to dig into the StreamingOutput thing shortly - thanks @mlbiscoc !

gerlowskija avatar Nov 13 '24 21:11 gerlowskija

Alright, merged in the latest 'main' and synced it with the changes here. The only real change of note is that the interface file is now named 'ReplicationApis.java' (I went with the name on 'main' since it was already there. The merge conflicts were a bit hairy, so if anything other than that seems "off", it's likely a mistake so please point it out 😆

Hoping to dig into the StreamingOutput thing shortly - thanks @mlbiscoc !

Great, thanks Jason. Will go back through it and take a look at the merge soon. Also need to look that precommit failure…

mlbiscoc avatar Nov 14 '24 14:11 mlbiscoc

After a bit of reading, I agree with Matt that StreamingOutput seems to be the "preferred" approach.

The best explanation I could find on "why" comes from this SO post. To summarize: StreamingOutput is easier for various JAX-RS supported interceptor hooks (MessageBodyWriter, ResponseFilter, etc.). The example given is an interceptor that implements gzip-encoding - apparently that's tough to do on an OutputStream, but easier in some way for JAX-RS to handle when wrapped as a 'StreamingOutput'.

That said - supporting this type in our Java code-generation might take a bit more effort than we want to bite off here and now. The API should be fine as far as Javascript and Python clients are concerned, but the custom template we're using for Java makes that one a bit trickier.

I'll exempt fetchFile from Java code generation (we have an annotation for that purpose), and create a followup JIRA ticket around standardizing on and supporting StreamingOutput.

gerlowskija avatar Nov 14 '24 16:11 gerlowskija

(To close the loop, I created SOLR-17562 to cover the codegen StreamingOutput support)

Not sure what's up with the precommit and tests GHA's, but they both pass locally for me. Will take a look and fix if I can.

Other than that (and CHANGES.txt, which I'll do right before merge) - is there anything else that needs done here @mlbiscoc ?

gerlowskija avatar Nov 15 '24 14:11 gerlowskija

Not seeing documentation updates?

fsparv avatar Nov 15 '24 15:11 fsparv

Not seeing documentation updates?

Ah, good thought. Yeah, modules/deployment-guide/pages/user-managed-index-replication.adoc should probably be updated to mention the v2 version of the API.

(If you haven't done a ref-guide update yet - we tend to use a tabbed display to group v1 and v2 examples of a particular API together. There are lots of sections in the ref-guide that could serve as an example here, but I'll pick "Create Collection" as an example. For create-collection the asciidoc snippet here gets parsed and displayed into the nice looking "tabbed" display here.)

Other than the docs though, this LGTM and I'll aim to merge when they get added.

gerlowskija avatar Nov 18 '24 14:11 gerlowskija

Added in some documentation to the ref-guide for the filecontent api and it's parameters. Also made a small change to fix precommit. Went back through the merge and rebase and nothing sticks out to me that looks out of the ordinary.

mlbiscoc avatar Nov 18 '24 17:11 mlbiscoc