SOLR-16470: Create V2 equivalent of V1 Replication: Get files/{filePath}
https://issues.apache.org/jira/browse/SOLR-16470
Description
Create a V2 equivalent API for the replication "GET files/{filePath}"
V1
/solr/{coreName}/replication?command=filecontent&file=_0_Lucene99_0.tmd&wt=filestream
V2
/api/cores/{coreName}/replication/files/{filePath}?dirType=file
Other sample requests
/api/cores/{coreName}/replication/files/./tlog.0000000000000000000?dirType=tlogFile
/api/cores/{coreName}/replication/files/./solrconfig.xml?dirType=cf
Solution
Moved getFileStream() logic into ReplicationAPIBase class and updated the V1 endpoint to use the CoreReplicationAPI class so that the ReplicationHandler can be deprecated in the future.
For the V2 api, made dirType a required parameters specifying what file directory (cf/tlogFile/file) to fetch a file from and better expressed the optional parameters that exist for the api
offsetcompressionchecksummaxWriteMBPerSecgeneration
Tests
Wrote new unit test in CoreReplicationAPITest class testFetchFile()
Checklist
Please review the following and check all that apply:
- [x] I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
- [x] I have created a Jira issue and added the issue ID to my pull request title.
- [x] I have given Solr maintainers access to contribute to my PR branch. (optional but recommended, not available for branches on forks living under an organisation)
- [x] I have developed this patch against the
mainbranch. - [x] I have run
./gradlew check. - [x] I have added tests for my changes.
- [x] I have added documentation for the Reference Guide
Hey @gerlowskija, should be the last API for SOLR-16470 whenever you get time to review. Thanks!
Awesome - thanks @mlbiscoc . Sorry for the delay getting to this - took me awhile to catch back up after the travel last week. Reviewing now...
Hey @gerlowskija made the changes but ended up kind of inflating this PR to support the OpenAPI code generation and moving files. Lmk if this went the right direction
Hey @gerlowskija made the changes but ended up kind of inflating this PR to support the OpenAPI code generation and moving files
Awesome! I see a lot of the increase comes from moving some of the other replication APIs ("list files", "get index version") from core over to the "api" module -- I've actually got another PR that takes a stab at some of those changes as well, so if we rebase one on top of the other, that should shrink this PR back down a good bit. Will take a crack at that shortly, and check back in when that's done!
Very curious about the "StreamingOutput" type you found. I've used InputStream in a few similar cases with other APIs (e.g. there's a "filestore" API that returns raw files, similar to this /replication one), though maybe StreamingOutput is better? Will have to do some reading...
Alright, merged in the latest 'main' and synced it with the changes here. The only real change of note is that the interface file is now named 'ReplicationApis.java' (I went with the name on 'main' since it was already there. The merge conflicts were a bit hairy, so if anything other than that seems "off", it's likely a mistake so please point it out 😆
Hoping to dig into the StreamingOutput thing shortly - thanks @mlbiscoc !
Alright, merged in the latest 'main' and synced it with the changes here. The only real change of note is that the interface file is now named 'ReplicationApis.java' (I went with the name on 'main' since it was already there. The merge conflicts were a bit hairy, so if anything other than that seems "off", it's likely a mistake so please point it out 😆
Hoping to dig into the StreamingOutput thing shortly - thanks @mlbiscoc !
Great, thanks Jason. Will go back through it and take a look at the merge soon. Also need to look that precommit failure…
After a bit of reading, I agree with Matt that StreamingOutput seems to be the "preferred" approach.
The best explanation I could find on "why" comes from this SO post. To summarize: StreamingOutput is easier for various JAX-RS supported interceptor hooks (MessageBodyWriter, ResponseFilter, etc.). The example given is an interceptor that implements gzip-encoding - apparently that's tough to do on an OutputStream, but easier in some way for JAX-RS to handle when wrapped as a 'StreamingOutput'.
That said - supporting this type in our Java code-generation might take a bit more effort than we want to bite off here and now. The API should be fine as far as Javascript and Python clients are concerned, but the custom template we're using for Java makes that one a bit trickier.
I'll exempt fetchFile from Java code generation (we have an annotation for that purpose), and create a followup JIRA ticket around standardizing on and supporting StreamingOutput.
(To close the loop, I created SOLR-17562 to cover the codegen StreamingOutput support)
Not sure what's up with the precommit and tests GHA's, but they both pass locally for me. Will take a look and fix if I can.
Other than that (and CHANGES.txt, which I'll do right before merge) - is there anything else that needs done here @mlbiscoc ?
Not seeing documentation updates?
Not seeing documentation updates?
Ah, good thought. Yeah, modules/deployment-guide/pages/user-managed-index-replication.adoc should probably be updated to mention the v2 version of the API.
(If you haven't done a ref-guide update yet - we tend to use a tabbed display to group v1 and v2 examples of a particular API together. There are lots of sections in the ref-guide that could serve as an example here, but I'll pick "Create Collection" as an example. For create-collection the asciidoc snippet here gets parsed and displayed into the nice looking "tabbed" display here.)
Other than the docs though, this LGTM and I'll aim to merge when they get added.
Added in some documentation to the ref-guide for the filecontent api and it's parameters. Also made a small change to fix precommit. Went back through the merge and rebase and nothing sticks out to me that looks out of the ordinary.