Running solr_wrapper a 2nd time with "persist" option set does not find the existing core
Versions used:
- Using solr_wrapper 0.12.0
- Using Solr 6.4.1 (the version selected by solr_wrapper)
Expected Result:
I should be able to:
- run `solr_wrapper` for a 1st time, with a configuration that has the `persist` flag set to `true`.
- stop `solr_wrapper`
- run `solr_wrapper` a 2nd time, with the same config, and not get an error.
Actual Results:
Running solr_wrapper a 2nd time, with the same config, gives an error (see below).
To Replicate:
- Create a config file named `.solr_wrapper.yml` containing the following:

      collection:
        dir: solr/config/
        name: my-core
        persist: true

- Run `solr_wrapper` for the first time (without the core having been created yet) with the specified config, like this:

      solr_wrapper --config .solr_wrapper.yml

- At this point, Solr starts.
- Stop Solr using Ctrl+C
- Run `solr_wrapper` again, with the same command and config file:

      solr_wrapper --config .solr_wrapper.yml

- At this point, I get the following error:

      ERROR: Core 'my-core' already exists! Checked core existence using Core API command: http://localhost:8983/solr/admin/cores?action=STATUS&core=my-core
Background:
- I've tracked this down to the fact that `SolrWrapper::Client#core?` is returning false when checking for the core named `my-core`.
- In my case, the core was created (during the first start of `solr_wrapper`) in `/tmp/solr-6.4.1/server/solr/my-core`, and after getting the error, I can still see the directory corresponding to `my-core` on the filesystem.
- The Solr API call being made to check for the core inside of `SolrWrapper::Client#core?` looks like this: `admin/cores?action=STATUS&wt=json&core=my-core`
- The response returned by Solr is: `{"responseHeader":{"status":0,"QTime":1},"initFailures":{},"status":{"my-core":{}}}`
- This would indicate that `SolrWrapper::Client#core?` is actually behaving correctly and returning false, given the response from Solr. However, Solr itself is not finding the core that it created previously.
- This causes `SolrWrapper::Client#exists?` to return `false`, which in turn causes this conditional to also return `false`, which in turn runs the line `exec('create', create_options)`.
- And it is the call to `exec('create', create_options)` where Solr is somehow able to find the core it couldn't find before, and throws the error for trying to create a core that already exists.
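To make the point about the STATUS response concrete, here is a minimal sketch of how such a response could be interpreted. The helper name `core_loaded?` is hypothetical and this is not the gem's actual implementation; it only assumes that Solr reports a core it does not know about as an empty object under `status`, as in the response quoted above.

```ruby
require 'json'

# Hypothetical helper (not the gem's code): decide whether a Core API STATUS
# response describes a core that Solr currently has loaded.
def core_loaded?(status_body, core_name)
  status = JSON.parse(status_body).fetch('status', {})
  entry = status[core_name]
  # The key is present even for an unknown core, so check that the entry
  # actually carries details rather than being an empty object.
  entry.is_a?(Hash) && !entry.empty?
end

# The response quoted in this report.
reported = '{"responseHeader":{"status":0,"QTime":1},"initFailures":{},"status":{"my-core":{}}}'
puts core_loaded?(reported, 'my-core') # prints "false"
```

With the response from this report the check yields `false`, which matches the behavior described: the wrapper concludes the core is missing and goes on to call `create`.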
This is the same as #69
But #69 doesn't refer to the persist option. If the persist option is set to false, and you try to create a core that already exists, then it's my understanding that an error should be raised. IOW, you need to be explicit when you want to re-use an existing core.
@afred I think that #69 was just less well described. I think the same fix will wipe out both issues.
It appears to me that this is a timing-related issue.
I cannot reliably trigger or eliminate the error using the gem as released. On a low-resource VM the issue seems to be more prevalent, i.e. I get the "Core ... already exists" error several times in a row before Solr starts as expected. See the example runs, where I did nothing but try launching the wrapper again, in this gist: https://gist.github.com/mark-dce/0a6bb38adde0d5b7645ed58778158575
If I add a status check at the beginning of the `create` method, something like

    def create(options = {})
      # Wait a few seconds if Solr has not reported itself as started yet
      sleep 5 unless started?
      options[:name] ||= SecureRandom.hex
      # etc.

the Solr service appears to start completely reliably. See this gist of the behavior after the change: https://gist.github.com/mark-dce/7aaede62802adea79ab826a48af23ad8
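A fixed `sleep 5` is a guess about how long startup takes. As a rough sketch of an alternative, the wait could poll a readiness check until it passes or a timeout expires. The `wait_until` helper below is hypothetical and not part of solr_wrapper; the block stands in for whatever readiness check is used (for example `started?`).

```ruby
# Hypothetical helper: poll the given block until it returns a truthy value
# or the timeout (in seconds) expires. Returns true on success, false on timeout.
def wait_until(timeout: 30, interval: 0.5)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    return true if yield
    return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
    sleep interval
  end
end

# Stand-in for a real readiness check: becomes true on the third poll.
polls = 0
ready = wait_until(timeout: 5, interval: 0.01) { (polls += 1) >= 3 }
puts ready # prints "true"
```

Inside `create`, the idea would be something along the lines of `wait_until { started? }` in place of the fixed sleep, though I have not tested that against the gem.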
As I buried at the end of a massive comment on the very-related #69, if this gem does undergo future development it might be best to find a better core existence check. This page has a couple of newer answers that indicate the accepted solution (as used in this gem) is not reliable and/or has been superseded in newer versions of Solr.
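One possible complementary check, which is my own assumption and not taken from the linked page, is to look at the filesystem: Solr's core discovery relies on a `core.properties` file in each core's instance directory, and the report above notes that the `my-core` directory was still present on disk. The helper name `core_on_disk?` and the paths below are hypothetical.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: true when a core.properties marker exists for the core
# under the given Solr home directory.
def core_on_disk?(solr_home, core_name)
  File.file?(File.join(solr_home, core_name, 'core.properties'))
end

# Demonstration against a throwaway directory rather than a real Solr install.
Dir.mktmpdir do |solr_home|
  puts core_on_disk?(solr_home, 'my-core') # no core directory yet
  FileUtils.mkdir_p(File.join(solr_home, 'my-core'))
  FileUtils.touch(File.join(solr_home, 'my-core', 'core.properties'))
  puts core_on_disk?(solr_home, 'my-core') # marker file now present
end
```

This only shows that the core was created at some point, not that Solr has finished loading it, so it would complement rather than replace the STATUS call.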
But perhaps it will just wane as more and more people develop in containers instead. I still use and appreciate it though! :)