Elasticsearch 7.11.1 service crashes on Windows 2019
Elasticsearch version (bin/elasticsearch --version): 7.11.1
Plugins installed: [none]
JVM version (java -version):
OS version (uname -a if on a Unix-like system): Windows 2019
Description of the problem including expected versus actual behavior:
Steps to reproduce: Start the service from Services manager
7.11.1 installs fine on Windows 2019, but it does not start after installation. When I try to start the service, it runs briefly and stops after a few seconds.
Provide logs (if relevant):
Faulting application name: elasticsearch.exe, version: 7.11.1.0, time stamp: 0x602a80b1
Faulting module name: KERNELBASE.dll, version: 10.0.17763.1518, time stamp: 0xff301d3c
Exception code: 0xe0434352
Fault offset: 0x00000000000396c9
Faulting process id: 0x930
Faulting application start time: 0x01d7064152fb189f
Faulting application path: C:\Program Files\Elastic\Elasticsearch\7.11.1\bin\elasticsearch.exe
Faulting module path: C:\Windows\System32\KERNELBASE.dll
Report Id: 2b90eeb5-eb2a-457f-8967-46c416b938e3
Faulting package full name:
Faulting package-relative application ID:
Application: elasticsearch.exe
Framework Version: v4.0.30319
Description: The process was terminated due to an unhandled exception.
Exception Info: Elastic.ProcessHosts.Process.StartupException
at Elastic.ProcessHosts.Process.ProcessBase.HandleException(System.Exception)
at System.Reactive.ObserverBase`1[[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]].OnError(System.Exception)
at System.Reactive.Observer`1[[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]].OnError(System.Exception)
at System.Reactive.Linq.ObservableImpl.AsObservable`1+_[[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]].OnError(System.Exception)
at System.Reactive.AutoDetachObserver`1[[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]].OnErrorCore(System.Exception)
at System.Reactive.ObserverBase`1[[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]].OnError(System.Exception)
at Elastic.ProcessHosts.Process.ObservableProcess+<>c__DisplayClass22_0.<CreateProcessExitSubscription>b__0(System.Reactive.EventPattern`1<System.Object>)
at System.Reactive.AnonymousSafeObserver`1[[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]].OnNext(System.__Canon)
at System.EventHandler.Invoke(System.Object, System.EventArgs)
at System.Diagnostics.Process.OnExited()
at System.Diagnostics.Process.RaiseOnExited()
at System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
at System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
at System.Threading._ThreadPoolWaitOrTimerCallback.PerformWaitOrTimerCallback(System.Object, Boolean)
Can anyone else verify this?
Pinging @elastic/es-delivery (Team:Delivery)
@plancked Did you use the MSI installer or just download the .zip distribution and run the .bat script to install the service?
I used the MSI installer.
@ygel could you provide some insight on this?
Hi @ygel @mark-vieira, any updates on this? Thanks.
Hello @plancked. Windows Service functionality is implemented by a wrapper process that spawns a child Elasticsearch Java process. The exception trace tells me that the wrapper process noticed that the Java process exited unexpectedly. Please attach the logs from the Elasticsearch process; they will likely contain information about the true cause.
@ygel Sure, I can provide the logs. Could you tell me how? Do you need logs from the Windows Event Viewer or something else? Thanks.
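The wrapper-and-child arrangement described above can be sketched roughly as follows. This is only an illustration in Python, not the actual Elastic.ProcessHosts code (which is .NET); `run_supervised` and `ChildExitedError` are hypothetical names, and the failing child is a stand-in for the Java process.

```python
import subprocess
import sys


class ChildExitedError(RuntimeError):
    """Hypothetical error raised when the supervised child exits with a failure."""


def run_supervised(command):
    """Start a child process, wait for it, and fail loudly on a non-zero exit."""
    child = subprocess.Popen(
        command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
    )
    stdout, stderr = child.communicate()
    if child.returncode != 0:
        # The wrapper only knows that the child died; the real reason
        # is whatever the child wrote to its own output or log files.
        raise ChildExitedError(
            f"child exited with code {child.returncode}: {stderr.strip()}"
        )
    return stdout


if __name__ == "__main__":
    # Stand-in for the Java process: prints a reason and exits with code 1.
    failing_child = [
        sys.executable,
        "-c",
        "import sys; print('startup check failed', file=sys.stderr); sys.exit(1)",
    ]
    try:
        run_supervised(failing_child)
    except ChildExitedError as error:
        print(error)
```

The point of the sketch is that the exception raised by the wrapper says little by itself, which is why the child's own logs are needed for diagnosis.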
[1]: the default discovery settings are unsuitable for production use; at least one of [discovery.seed_hosts, discovery.seed_providers, cluster.initial_master_nodes] must be configured
This is the issue. You need to configure node discovery settings.
https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-discovery-settings.html
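The rule stated in the log message can be illustrated with a small check. This is a simplified sketch, not Elasticsearch's own bootstrap check: it assumes the settings are available as a flat Python dict keyed by setting name, `has_discovery_config` is a hypothetical helper, and both example dicts are made up for illustration.

```python
# The three settings named in the log message; at least one must be set.
DISCOVERY_SETTINGS = (
    "discovery.seed_hosts",
    "discovery.seed_providers",
    "cluster.initial_master_nodes",
)


def has_discovery_config(settings):
    """Return True if at least one discovery setting has a non-empty value."""
    return any(settings.get(name) for name in DISCOVERY_SETTINGS)


# Hypothetical configuration with only the network host set.
only_network_host = {"network.host": "10.1.1.28"}

# Hypothetical configuration that also names a seed host and a master node.
with_discovery = {
    "network.host": "10.1.1.28",
    "discovery.seed_hosts": ["10.1.1.28"],
    "cluster.initial_master_nodes": ["HP"],
}

print(has_discovery_config(only_network_host))  # False
print(has_discovery_config(with_discovery))     # True
```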
Right, but what if I don't use a cluster?
MSI should have configured a single-node cluster for you automatically. Have you made any changes during install or used default settings?
I used the default settings, except for the network host where I used an IP address in the LAN:
network.host: 10.1.1.28
Now I have entered this:
discovery.seed_hosts: 10.1.1.28:9300
And the service runs, but the application (Xenforo) cannot connect to it. Also, running netstat on the Elasticsearch server does not show any listener on port 9200.
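As an alternative to reading netstat output, a plain TCP connection attempt shows whether anything accepts connections on a given address and port. A minimal sketch, assuming Python is available on the machine; `is_listening` is a hypothetical helper, not part of any Elasticsearch tooling.

```python
import socket


def is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For the setup in this thread, `is_listening("10.1.1.28", 9200)` would check the HTTP port and `is_listening("10.1.1.28", 9300)` the transport port. A successful connection only shows that something is listening there, not that it is a healthy Elasticsearch node.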
discovery.seed_hosts: 10.1.1.28:9300
Please specify only the IP, without the port. I realize that the self-documenting elasticsearch.yml is likely absent from the MSI distribution, where the installer writes configuration without comments. Here's the relevant excerpt:
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when this node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#
#discovery.seed_hosts: ["host1", "host2"]
#
# Bootstrap the cluster using an initial set of master-eligible nodes:
#
#cluster.initial_master_nodes: ["node-1", "node-2"]
#
# For more information, consult the discovery and cluster formation module documentation.
~Please try it as a single value first: discovery.seed_hosts: 10.1.1.28~ EDIT: It should be a list.
Yes, I did try only the IP for seed_hosts, but it didn't work.
But I uninstalled Elasticsearch, reinstalled it, filled in all the configuration settings (host, seed hosts, master node), and now it works. This is the elasticsearch.yml file that was generated by the MSI installer:
cluster.initial_master_nodes:
- HP
cluster.name: elasticsearch
discovery.seed_hosts:
- 10.1.1.28
http.port: 9200
network.host: 10.1.1.28
node.data: true
node.ingest: true
node.master: true
node.max_local_storage_nodes: 1
node.name: HP
path.data: C:\ProgramData\Elastic\Elasticsearch\data
path.logs: C:\ProgramData\Elastic\Elasticsearch\logs
transport.tcp.port: 9300
xpack.license.self_generated.type: basic
xpack.security.enabled: false
So I'm guessing something is wrong with the MSI installer when you only use all default settings, except the network host field.
I'm glad it worked out in the end and I appreciate you sticking around for diagnosis!
file that was generated by the MSI installer:
Without the code block [1] it's impossible to tell whether the YAML formatting is broken.
... when you only use all default settings, except the network host field.
Please clarify what the value for the network host was.
I am currently starting on a round of updates for MSI, thank you for pointing out a potential rotten code path. Code rot is real :)
[1] By code block I mean using three back-ticks (same key as ~) to preserve formatting. YAML is (unfortunately) extremely picky about formatting and indentation.
I edited the previous post. Please note that this is the configuration that works.
On the previous installation, I entered 10.1.1.28 for the network host. That's the only thing I did, I did not touch anything else.
So you should be able to reproduce it.
... I entered 10.1.1.28 for the network host. That's the only thing I did, I did not touch anything else.
Thank you, I'll try to repro
cluster.initial_master_nodes: - HP ...
Value HP for initial_master_nodes looks mildly suspicious, does it satisfy requirements outlined in important settings:
Identify the initial master nodes by their node.name, which defaults to their hostname. Ensure that the value in cluster.initial_master_nodes matches the node.name exactly. If you use a fully-qualified domain name (FQDN) such as master-node-a.example.com for your node names, then you must use the FQDN in this list. Conversely, if node.name is a bare hostname without any trailing qualifiers, you must also omit the trailing qualifiers in cluster.initial_master_nodes.
EDIT: it does node.name: HP
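The exact-match requirement quoted above can be checked mechanically. A minimal sketch, assuming the two values have already been read out of elasticsearch.yml into Python lists; `unmatched_master_nodes` is a hypothetical helper, and the second example reuses the FQDN from the quoted documentation.

```python
def unmatched_master_nodes(initial_master_nodes, node_names):
    """Return entries of cluster.initial_master_nodes that match no node.name exactly."""
    known = set(node_names)
    # Comparison is an exact string match: no case folding, no FQDN trimming.
    return [entry for entry in initial_master_nodes if entry not in known]


# Single node whose node.name equals the listed master node: nothing is unmatched.
print(unmatched_master_nodes(["HP"], ["HP"]))  # []

# Bare hostname listed, but the node is named by its FQDN: the entry is reported.
print(unmatched_master_nodes(["master-node-a"], ["master-node-a.example.com"]))  # ['master-node-a']
```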
Moving this to windows-installers repo
cluster.initial_master_nodes: - HP ...
Value HP for initial_master_nodes looks mildly suspicious, does it satisfy requirements outlined in important settings:
Identify the initial master nodes by their node.name, which defaults to their hostname. Ensure that the value in cluster.initial_master_nodes matches the node.name exactly. If you use a fully-qualified domain name (FQDN) such as master-node-a.example.com for your node names, then you must use the FQDN in this list. Conversely, if node.name is a bare hostname without any trailing qualifiers, you must also omit the trailing qualifiers in cluster.initial_master_nodes.
EDIT: it does node.name: HP
Hi, that is not the actual value of the file. I edited it for privacy reasons.
Hello,
This incident also happened to me with Elasticsearch 7.10.2.0, MSI installation, running on 64-bit Windows 10. Elasticsearch was running fine for more than 3 weeks as a service with a developers' mode configuration:
bootstrap.memory_lock: false
cluster.initial_master_nodes:
- STBGR10
cluster.name: Ioannas_elasticsearch
http.port: 9200
node.data: true
node.ingest: true
node.master: true
node.max_local_storage_nodes: 1
node.name: STBGR10
path.data: C:\ProgramData\Elastic\Elasticsearch\data
path.logs: C:\ProgramData\Elastic\Elasticsearch\logs
transport.tcp.port: 9300
xpack.license.self_generated.type: basic
xpack.security.enabled: false
After a system reboot including software updates (what a surprise for Windows), I got the same crash as @plancked.
The solution was to include the following in my elasticsearch.yml:
network.host: <FQDN>
discovery.seed_hosts:
- <FQDN>
And Elasticsearch is up and running again!
Thank you @IoannaT !