solr
solr copied to clipboard
SOLR-16824: Back port to branch 9x
https://issues.apache.org/jira/browse/SOLR-16824
Description
Backport to branch_9x the code in https://github.com/apache/solr/commit/0df3e63ffbdcde0eb953876a3fb47994d334d8f2. The changes are backwards compatible, however we need to remove the use of OPTION_CREDENTIALS from all the commands.
Solution
Please provide a short description of the approach taken to implement your solution.
Tests
Please describe the tests you've developed or run to confirm this patch implements the feature or solves the problem.
Checklist
Please review the following and check all that apply:
- [ ] I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
- [ ] I have created a Jira issue and added the issue ID to my pull request title.
- [ ] I have given Solr maintainers access to contribute to my PR branch. (optional but recommended)
- [ ] I have developed this patch against the
mainbranch. - [ ] I have run
./gradlew check. - [ ] I have added tests for my changes.
- [ ] I have added documentation for the Reference Guide
Humm.. I expected Crave and all the script tests to run on github.....
I think you used the wrong JIRA in the title and description. Could you add some content to the title in addition to the backport?
I think you used the wrong JIRA in the title and description. Could you add some content to the title in addition to the backport?
Argh, that is what I get for not copy-pasting the JIRA ticket number!
Okay, this is starting to turn into a bear... I'm working through dealing with the fact that in 9x most of our tools don't have basicauth, and that there are lots of bats tests on main that aren't on branch_9x... We'll see....
Perhaps more of the actions should run for PRs against the stable branch, e.g. add stable branch to https://github.com/apache/solr/blob/main/.github/workflows/tests-via-crave.yml#L6
I merged the changes to github workflows and triggered a new build. The precommit workflow now runs, but I cannot see the other workflows kicking off...
I'll take a stab at some of the failures...
Pushed some commits that at least makes Solr compile, and Fixes HealthCheckToolTest.
I propose that we revert most .bats files to testing original (deprecated) options, so we know the old cli API works.
I can see some confusion with /solr url suffix.
Okay, this is starting to turn into a bear... I'm working through dealing with the fact that in 9x most of our tools don't have basicauth, and that there are lots of bats tests on main that aren't on branch_9x... We'll see....
Wrt basicAuth support with --credentials, I just kept what 9x already has, i.e. no basic auth support for all the tools, so removed OPTION_CREDENTIALS from those. Kept it for AuthTool only, of course, with added support for -credentials.
Your changes look great, I especially like the helper for deprecated and current option look up. That type of logic is spread about in the work I did and I guess easy to goof!
Wrt the create tool vs create_collection and create_core, I tried to keep all three, may still be a bit un-polished.
EDIT Added some fixes to make create_collection and create_core accept new-style args. However, I wonder if we could simplify all this and only keep CreateTool and make bin/solr always call that one...
I believe I fixed the docker tests (gradlew testDocker), which failed at create_core_exec :)
But when building locally with gradlew dev, and then running bin/solr create_core -c foo, I get a scary error
java.lang.NoClassDefFoundError: org/apache/solr/logging/MDCSnapshot
I'm not able to proceed with the hunt more for now, but with PR tests running it should be easier for any committer to pick a test failure and fix it!
Fixed the test_auth.bats test in commit https://github.com/apache/solr/pull/2540/commits/f187b3e6fd762a9ba047bf6828193b56e83f6f74.
Reason: A mix of missing back-compat options parsing and missing /solr suffix on baseUrl.
Pushed several changes
- Finally fixed the
/solrurl suffix normalization issue I hope CreateToolnow used forcreate,create_coreandcreate_collection.solr create -husage options printed by SolrCLI (with 120 c width instead of 74)- Docker tests once again pass
- BATS tests for
authandcreatenow pass.
Still many BATS tests fail due to missing cmdline definitions...
Crave tests locally turned up with only two failed tests, and they definitely look relevant to this PR:
ERROR: The following test(s) have failed:
- org.apache.solr.cloud.PackageManagerCLITest.testPackageManager (:solr:core)
Test history: https://ge.apache.org/scans/tests?search.rootProjectNames=solr-root&tests.container=org.apache.solr.cloud.PackageManagerCLITest&tests.test=testPackageManager http://fucit.org/solr-jenkins-reports/history-trend-of-recent-failures.html#series/org.apache.solr.cloud.PackageManagerCLITest.testPackageManager
Test output: /tmp/src/solr/solr/core/build/test-results/test/outputs/OUTPUT-org.apache.solr.cloud.PackageManagerCLITest.txt
Reproduce with: ./gradlew :solr:core:test --tests "org.apache.solr.cloud.PackageManagerCLITest.testPackageManager" -Ptests.jvms=96 "-Ptests.jvmargs=-XX:TieredStopAtLevel=1 -XX:+UseParallelGC -XX:ActiveProcessorCount=1 -XX:ReservedCodeCacheSize=120m" -Ptests.seed=7B8E87042F3A6E2C -Ptests.timeoutSuite=600000! -Ptests.file.encoding=ISO-8859-1
- org.apache.solr.cloud.SolrCloudExampleTest.testLoadDocsIntoGettingStartedCollection (:solr:core)
Test history: https://ge.apache.org/scans/tests?search.rootProjectNames=solr-root&tests.container=org.apache.solr.cloud.SolrCloudExampleTest&tests.test=testLoadDocsIntoGettingStartedCollection http://fucit.org/solr-jenkins-reports/history-trend-of-recent-failures.html#series/org.apache.solr.cloud.SolrCloudExampleTest.testLoadDocsIntoGettingStartedCollection
Test output: /tmp/src/solr/solr/core/build/test-results/test/outputs/OUTPUT-org.apache.solr.cloud.SolrCloudExampleTest.txt
Reproduce with: ./gradlew :solr:core:test --tests "org.apache.solr.cloud.SolrCloudExampleTest.testLoadDocsIntoGettingStartedCollection" -Ptests.jvms=96 "-Ptests.jvmargs=-XX:TieredStopAtLevel=1 -XX:+UseParallelGC -XX:ActiveProcessorCount=1 -XX:ReservedCodeCacheSize=120m" -Ptests.seed=7B8E87042F3A6E2C -Ptests.timeoutSuite=600000! -Ptests.file.encoding=ISO-8859-1
PackageManagerCLITest.testPackageManager was just a case of solrUrl and solr-url.. The other one may be about dealing with the /solr in the path or not...
I looked a bit more at SolrCloudExampleTest.testLoadDocsIntoGettingStartedCollection to see if I could figure it out... I thought I had some ideas, but no luck.. I believe we are struggling to detect if we are in cloud mode or not, and maybe we are not crafting the right url.. I checked, and the context route being only on main may be part of the problem:
* SOLR-16911: Establish /solr as the only host context supported by Solr, removing legacy ability to change this. Solr paths will only be either /solr or /api now. (Eric Pugh)
Just fixed the test_delete_collection.bats test (Linux only).
Getting there... All BATS tests now run, except these five:
not ok 32 solr help flag prints help in 1521ms
# (from function `assert_output' in file /home/runner/work/solr/solr/.gradle/node/packaging/node_modules/bats-assert/src/assert.bash, line 247,
# in test file test/test_help.bats, line 26)
# `assert_output --partial 'Usage: solr COMMAND OPTIONS'' failed
not ok 33 solr with no flags prints help in 133ms
# (in test file test/test_help.bats, line 30)
# `run -1 solr' failed, expected exit code 1, got 0
not ok 37 status help flag prints help in 183ms
# (from function `assert_output' in file /home/runner/work/solr/solr/.gradle/node/packaging/node_modules/bats-assert/src/assert.bash, line 247,
# in test file test/test_help.bats, line 55)
# `assert_output --partial 'usage: status'' failed
not ok 38 healthcheck help flag prints help in 196ms
# (from function `assert_output' in file /home/runner/work/solr/solr/.gradle/node/packaging/node_modules/bats-assert/src/assert.bash, line 247,
# in test file test/test_help.bats, line 63)
# `assert_output --partial 'usage: healthcheck'' failed
#
# -- output does not contain substring --
# substring (1 lines):
# usage: healthcheck
# output (14 lines):
#
# ERROR: Unrecognized or misplaced argument: --help!
not ok 78 status does not expose cli parameters to end user in 278ms
# (from function `assert_output' in file /home/runner/work/solr/solr/.gradle/node/packaging/node_modules/bats-assert/src/assert.bash, line 247,
# in test file test/test_status.bats, line 45)
# `assert_output --partial "ERROR: Unrecognized or misplaced argument: -solr!"' failed
All BATS are green 🦇
Please do a review of my latest couple commits to see if you agree with the decisions.
I looked a bit more at
SolrCloudExampleTest.testLoadDocsIntoGettingStartedCollectionto see if I could figure it out... I thought I had some ideas, but no luck.. I believe we are struggling to detect if we are in cloud mode or not, and maybe we are not crafting the right url.. I checked, and the context route being only on main may be part of the problem:* SOLR-16911: Establish /solr as the only host context supported by Solr, removing legacy ability to change this. Solr paths will only be either /solr or /api now. (Eric Pugh)
Yea, I gave it a shot as well, but the test (BaseDistributedSearchTestCase) insists in using some dynamic host-context instead of /solr. I tried hardcoding context as in main, but to no avail. This is a tricky one.
Managed to fix SolrCloudExampleTest by adapt SolrCLI.getSolrClient to use sysProp hostContext from tests if exists. This made the test pass, but now the BATS test for PostTool started failing. Reason was that it used SolrCli.getSolrClient(solrUrl) to get its client for doing commit, with the full http://localhost:8983/solr/mycollection url as base URL. Which no longer works (should it?). So I modified PostTool to instead extract collection name from URL, get a clinent with baseURL and call client.commit(myCollection).
BUILD SUCCESSFUL in 5m 10s (crave from laptop)
@janhoy, just looking again at this one, thanks for all the efforts!!!! This looks like it's ready to commit? However before I click "Squash and merge", I wanted to confirm with you!!
It’s the best I could do with my available time. I have NOT done a thorough review of all options in all tools, just fixed the ones causing test failures.
The windows script lacks tests so who knows what works or not? Perhaps we are able to do follow up work on that front?
If no one else are willing to do a more thorough manual review, I believe merging this as is, is better than not merging.
And very important: some of my commits may include fixes that should be forward ported to main. We should make a placeholder JIRA to research what that might be.
Good thoughts here @janhoy... I will leave this open for another day or so, and then look to commit it. I will put out an appeal to the dev mailing list when 9.7 release process starts asking for additional testing of bin/solr.cmd...
I filed SOLR-17371 for forward-porting. Also synced PR branch with branch_9x.
This fails:
./gradlew :solr:core:test --tests "org.apache.solr.cloud.SolrCloudExampleTest.testLoadDocsIntoGettingStartedCollection" -Ptests.jvms=96 "-Ptests.jvmargs=-XX:TieredStopAtLevel=1 -XX:+UseParallelGC -XX:ActiveProcessorCount=1 -XX:ReservedCodeCacheSize=120m" -Ptests.seed=44FFA474CD768986 -Ptests.timeoutSuite=600000! -Ptests.file.encoding=UTF-8
however this passes:
./gradlew :solr:core:test --tests "org.apache.solr.cloud.SolrCloudExampleTest.testLoadDocsIntoGettingStartedCollection"
okay, it fails on that tests.seed value. How can I step through the test in intellij with that seed value?
looks like logic to identify a nodename assumes one _ and we have two _ chars!