graylog2-server
Graylog 4.3 with Opensearch 1.3.2 - ES version check can't be disabled.
I am installing the following on a Centos 7 box.
OpenJDK-11 Opensearch-1.3.2 Graylog 4.3.1 via
Expected Behavior
Setting elasticsearch_disable_version_check in server.conf should disable the Elasticsearch version check and so prevent the hostname verification issue, as the SSL verification appears to be part of the Elasticsearch version check. Does Opensearch 1.3.2 work with Graylog 4.3.1, as described at https://docs.graylog.org/docs/installing-opensearch ?
Current Behavior
The version check still runs and prevents Graylog from starting.
Steps to Reproduce (for bugs)
- Install Centos 7
- Install OpenJDK-11
- Install Opensearch via rpm.
- Setup certificates using https://opensearch.org/docs/latest/security-plugin/configuration/generate-certificates/
- Make sure certificates are imported into Java trust store and add root certs etc to Centos. Update root CA's.
- Configure opensearch.yml as shown at https://pastebin.com/vPMCkm9f and set OpenSearch to report an ES version rather than the OpenSearch version.
- Confirm with openssl s_client. Excerpt:
Acceptable client certificate CA names
/C=GB/ST=Jersey/L=St.Helier/O=DefenceLogic/OU=Security/CN=ROOT
Client Certificate Types: ECDSA sign, RSA sign, DSA sign
Requested Signature Algorithms: 0x07+0x08:0x08+0x08:ECDSA+SHA256:ECDSA+SHA384:ECDSA+SHA512:0x04+0x08:0x05+0x08:0x06+0x08:0x09+0x08:0x0A+0x08:0x0B+0x08:RSA+SHA256:RSA+SHA384:RSA+SHA512:DSA+SHA256:ECDSA+SHA224:RSA+SHA224:DSA+SHA224:ECDSA+SHA1:RSA+SHA1:DSA+SHA1
Shared Requested Signature Algorithms: ECDSA+SHA256:ECDSA+SHA384:ECDSA+SHA512:RSA+SHA256:RSA+SHA384:RSA+SHA512:DSA+SHA256:ECDSA+SHA224:RSA+SHA224:DSA+SHA224:ECDSA+SHA1:RSA+SHA1:DSA+SHA1
Peer signing digest: SHA512
Server Temp Key: ECDH, P-256, 256 bits
---
SSL handshake has read 1530 bytes and written 427 bytes
---
New, TLSv1/SSLv3, Cipher is ECDHE-RSA-AES256-GCM-SHA384
Server public key is 2048 bit
Secure Renegotiation IS supported
Compression: NONE
Expansion: NONE
No ALPN negotiated
SSL-Session:
Protocol : TLSv1.2
Cipher : ECDHE-RSA-AES256-GCM-SHA384
Session-ID: 7DBB1ED24854DB4F7959EB151500F48F5F825B7CE9ADEDC417526B57390C3F4C
Session-ID-ctx:
Master-Key: D5871EF87078B84697CC0D555AFFBB116D07057D927D9425220E89E37832C940C4FE5808909F0275ED5C77F45F8FB19B
Key-Arg : None
Krb5 Principal: None
PSK identity: None
PSK identity hint: None
Start Time: 1655231599
Timeout : 300 (sec)
Verify return code: 0 (ok)
---
- Set up the Graylog server.conf as shown at https://pastebin.com/0SKJKydE
- Setting elasticsearch_disable_version_check = true or elasticsearch_disable_version_check = false has no effect
Context
Excerpt from server log.
2022-06-14T19:30:33.662+01:00 INFO [cluster] Cluster created with settings {hosts=[localhost:27017], mode=SINGLE, requiredClusterType=UNKNOWN, serverSelectionTimeout='30000 ms', maxWaitQueueSize=5000}
2022-06-14T19:30:33.704+01:00 INFO [cluster] Cluster description not yet available. Waiting for 30000 ms before timing out
2022-06-14T19:30:33.730+01:00 INFO [connection] Opened connection [connectionId{localValue:1, serverValue:1}] to localhost:27017
2022-06-14T19:30:33.737+01:00 INFO [cluster] Monitor thread successfully connected to server with description ServerDescription{address=localhost:27017, type=STANDALONE, state=CONNECTED, ok=true, version=ServerVersion{versionList=[4, 2, 20]}, minWireVersion=0, maxWireVersion=8, maxDocumentSize=16777216, logicalSessionTimeoutMinutes=30, roundTripTimeNanos=2739143}
2022-06-14T19:30:33.749+01:00 INFO [connection] Opened connection [connectionId{localValue:2, serverValue:2}] to localhost:27017
2022-06-14T19:30:33.775+01:00 INFO [connection] Closed connection [connectionId{localValue:2, serverValue:2}] to localhost:27017 because the pool has been closed.
2022-06-14T19:30:33.777+01:00 INFO [MongoDBPreflightCheck] Connected to MongoDB version 4.2.20
2022-06-14T19:30:34.029+01:00 ERROR [VersionProbe] Unable to retrieve version from Elasticsearch node: Hostname opensearch.cyberkryption.local not verified:
certificate: sha256/KeB6DfLCNFq3561pPhy8Zc/+oU6pmSySnrPyzHbwfvQ=
DN: CN=opensearch.cyberkryption.local, OU=Security, O=DefenceLogic, L=St.Helier, ST=Jersey, C=GB
subjectAltNames: []. - Hostname opensearch.cyberkryption.local not verified:
certificate: sha256/KeB6DfLCNFq3561pPhy8Zc/+oU6pmSySnrPyzHbwfvQ=
DN: CN=opensearch.cyberkryption.local, OU=Security, O=DefenceLogic, L=St.Helier, ST=Jersey, C=GB
subjectAltNames: [].
Your Environment
- Graylog Version: 4.3.1
- Java Version: OpenJDK-11
- Opensearch Version: 1.3.2
- MongoDB Version: 4.2.20-1
- Operating System: Centos 7
- Browser version: n/a
@cyberkryption thank you for your report, a couple of things:
- setting OS to report the ES version is not necessary any more
- with that in place, try to configure the Elasticsearch version in Graylog using elasticsearch_version = 7 and see if it makes a difference for you, because version probing should be skipped that way, too.
- at first glance, elasticsearch_disable_version_check might actually be buggy. Still, what makes you sure that the hostname check is not part of the SSL handshake, in which case it would still persist even after disabling the version check?
I'll set up a test install and try to verify what you see, but this will take a little more time.
I will try setting elasticsearch_version=7 in the Graylog server.conf.
I want to confirm whether the hostname check is part of the Elasticsearch version check. If it is not and the problem still persists, it would be strange, as the host OS is reporting that all certificates etc. are good.
I will reconfigure and report back.
OK, I rechecked.
I set elasticsearch_rversion=7
and elasticsearch_disable_version_check = true
in server.conf
I also tried setting the hostname to the FQDN of the server using hostnamectl, but I still get the following in the Graylog server logs.
Hostnamectl output
hostnamectl
Static hostname: opensearch.cyberkryption.local
Icon name: computer-vm
Chassis: vm
Machine ID: e94d9047cfcb423aaa6f996b01e60de5
Boot ID: 30ae6365c0e145f59de9a9ed81f4647f
Virtualization: vmware
Operating System: CentOS Linux 7 (Core)
CPE OS Name: cpe:/o:centos:centos:7
Kernel: Linux 3.10.0-1160.66.1.el7.x86_64
Architecture: x86-64
2022-06-14T19:30:33.274+01:00 INFO [CmdLineTool] Running with JVM arguments: -Xms1g -Xmx1g -XX:NewRatio=1 -XX:+ResizeTLAB -XX:-OmitStackTraceInFastThrow -Djdk.tls.acknowledgeCloseNotify=true -Dlog4j2.formatMsgNoLookups=true -XX:+UseConcMarkSweepGC -XX:+CMSConcurrentMTEnabled -XX:+CMSClassUnloadingEnabled -Dlog4j.configurationFile=file:///etc/graylog/server/log4j2.xml -Djava.library.path=/usr/share/graylog-server/lib/sigar -Dgraylog2.installation_source=rpm
2022-06-14T19:30:33.662+01:00 INFO [cluster] Cluster created with settings {hosts=[localhost:27017], mode=SINGLE, requiredClusterType=UNKNOWN, serverSelectionTimeout='30000 ms', maxWaitQueueSize=5000}
2022-06-14T19:30:33.704+01:00 INFO [cluster] Cluster description not yet available. Waiting for 30000 ms before timing out
2022-06-14T19:30:33.730+01:00 INFO [connection] Opened connection [connectionId{localValue:1, serverValue:1}] to localhost:27017
2022-06-14T19:30:33.737+01:00 INFO [cluster] Monitor thread successfully connected to server with description ServerDescription{address=localhost:27017, type=STANDALONE, state=CONNECTED, ok=true, version=ServerVersion{versionList=[4, 2, 20]}, minWireVersion=0, maxWireVersion=8, maxDocumentSize=16777216, logicalSessionTimeoutMinutes=30, roundTripTimeNanos=2739143}
2022-06-14T19:30:33.749+01:00 INFO [connection] Opened connection [connectionId{localValue:2, serverValue:2}] to localhost:27017
2022-06-14T19:30:33.775+01:00 INFO [connection] Closed connection [connectionId{localValue:2, serverValue:2}] to localhost:27017 because the pool has been closed.
2022-06-14T19:30:33.777+01:00 INFO [MongoDBPreflightCheck] Connected to MongoDB version 4.2.20
2022-06-14T19:30:34.029+01:00 ERROR [VersionProbe] Unable to retrieve version from Elasticsearch node: Hostname opensearch.cyberkryption.local not verified:
certificate: sha256/KeB6DfLCNFq3561pPhy8Zc/+oU6pmSySnrPyzHbwfvQ=
DN: CN=opensearch.cyberkryption.local, OU=Security, O=DefenceLogic, L=St.Helier, ST=Jersey, C=GB
subjectAltNames: []. - Hostname opensearch.cyberkryption.local not verified:
certificate: sha256/KeB6DfLCNFq3561pPhy8Zc/+oU6pmSySnrPyzHbwfvQ=
DN: CN=opensearch.cyberkryption.local, OU=Security, O=DefenceLogic, L=St.Helier, ST=Jersey, C=GB
subjectAltNames: [].
2022-06-14T19:30:34.030+01:00 INFO [VersionProbe] Elasticsearch is not available. Retry #1
My understanding is that ERROR [VersionProbe] indicates that it is something to do with version checking?
That is why I went down the path of trying to disable ES version check.
Any help appreciated.
Thanks for checking/confirming. I'll try to reproduce it.
I can upload the exported vm for you to download and save you config time if you want.
Hello @cyberkryption,
First of all, I think that the real issue here is that the Graylog server can't verify the certificate of the Opensearch server. Are you sure that the Graylog process has a correctly configured Java truststore which contains the certificate used by Opensearch?
As @janheise mentioned, even if you were able to skip the version check, you would be unable to communicate with your OS instance, because the SSL error blocks all communication with the instance. You would get a similar exception elsewhere. The version check only triggers a simple HTTPS request to the configured OS address and tries to read the Opensearch version from the JSON response. It's the same type of HTTP communication as all other requests between Graylog and Opensearch.
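As a rough illustration of the probe described above, the following Python sketch reads the version from a root-endpoint JSON response. It is not Graylog's actual implementation (which is Java); the function name `parse_search_version` is hypothetical, and the sample body is abbreviated from the curl output posted later in this thread.

```python
import json


def parse_search_version(body: str) -> tuple:
    """Extract (major, minor, patch) from a root-endpoint JSON response.

    Hypothetical helper for illustration only, not part of Graylog.
    """
    doc = json.loads(body)
    number = doc["version"]["number"]  # e.g. "7.10.2"
    major, minor, patch = (int(part) for part in number.split(".")[:3])
    return major, minor, patch


# Abbreviated copy of the response shown by curl in this thread.
sample = """
{
  "name": "opensearch",
  "cluster_name": "graylog",
  "version": {"number": "7.10.2", "build_type": "rpm"},
  "tagline": "The OpenSearch Project: https://opensearch.org/"
}
"""

print(parse_search_version(sample))  # (7, 10, 2)
```

The parsing itself is trivial. The request has to complete a TLS handshake and hostname verification before any body is available, so a certificate problem surfaces at this point before any version logic runs.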
If you want to test your setup without SSL enabled between Graylog and Opensearch, you can disable it by configuring the plugins.security.ssl.http.enabled option in the OS conf.
If you still want to try to disable the version check and see what happens: the error from the stacktrace is not coming from the ESVersionCheckPeriodical as we originally assumed, but from the SearchDbPreflightCheck, which tries to verify that a valid search instance is available. You can try to disable all pre-flight checks by setting skip_preflight_checks to true in the Graylog configuration. But again, I think you'll just see the very same SSL error from a different part of the app, as there will be no communication possible with your OS instance.
Best regards, Tomas
Hi Tomas,
An update. I have checked the certificates are in the cacerts trust store. I imported them in .pem format as below.
[cyberkryption@opensearch certificates]$ sudo keytool -list -alias dlrootca -cacerts
Enter keystore password:
dlrootca, 24 Jun 2022, trustedCertEntry,
Certificate fingerprint (SHA-256): 94:56:A4:8B:DA:B0:AB:91:78:D0:6A:06:7D:0A:B2:20:D7:A1:B8:1B:E2:4D:1A:F0:5D:17:06:41:1E:75:A1:97
[cyberkryption@opensearch certificates]$ sudo keytool -list -alias opensearch -cacerts
Enter keystore password:
opensearch, 24 Jun 2022, trustedCertEntry,
Certificate fingerprint (SHA-256): A7:5F:C4:22:34:F5:A1:0B:6D:86:2F:9A:73:FF:1E:92:29:4F:80:00:42:EA:3B:11:B9:D0:E8:C1:86:05:7F:73
[cyberkryption@opensearch certificates]$
I started graylog and the server still failed to boot past the ES check.
Next, I set skip_preflight_checks to true in the Graylog configuration file.
2022-06-24T12:06:20.691+01:00 INFO [JerseyService] Started REST API at <opensearch.cyberkryption.local:9000>
2022-06-24T12:06:20.692+01:00 INFO [ServiceManagerListener] Services are healthy
2022-06-24T12:06:20.692+01:00 INFO [InputSetupService] Triggering launching persisted inputs, node transitioned from Uninitialized [LB:DEAD] to Running [LB:ALIVE]
2022-06-24T12:06:20.693+01:00 INFO [ServerBootstrap] Services started, startup times in ms: {FailureHandlingService [RUNNING]=1, UserSessionTerminationService [RUNNING]=16, GracefulShutdownService [RUNNING]=22, InputSetupService [RUNNING]=28, MongoDBProcessingStatusRecorderService [RUNNING]=52, LocalKafkaMessageQueueWriter [RUNNING]=61, ConfigurationEtagService [RUNNING]=77, PrometheusExporter [RUNNING]=78, EtagService [RUNNING]=79, OutputSetupService [RUNNING]=79, BufferSynchronizerService [RUNNING]=80, JobSchedulerService [RUNNING]=80, StreamCacheService [RUNNING]=81, LocalKafkaMessageQueueReader [RUNNING]=82, UrlWhitelistService [RUNNING]=94, LocalKafkaJournal [RUNNING]=97, LookupTableService [RUNNING]=116, PeriodicalsService [RUNNING]=131, JerseyService [RUNNING]=1506}
2022-06-24T12:06:20.694+01:00 INFO [ServerBootstrap] Graylog server up and running.
I sent a test message in using the following
echo '{"message":"Hello from the tcp stack","host":"cyberkryption023"}' | ncat 192.168.1.234 12201
It appears to be working
Can you point me to which checks are disabled as a result of setting skip_preflight_checks to true?
I have this issue as well: I converted a Debian 10 Graylog 4.3 instance from elasticsearch-oss to opensearch, and now it will not start.
Update: I figured out my issue, at least in my case. opensearch was binding itself to 127.0.1.1:9200 instead of the default 127.0.0.1:9200. I had to update the ES binding in /etc/graylog/server/server.conf:
elasticsearch_hosts = http://127.0.1.1:9200
@luckman212 Mine is binding to an IP address in my local network.
curl -XGET https://opensearch.cyberkryption.local:9200 -u 'admin:admin' --insecure
{
"name" : "opensearch",
"cluster_name" : "graylog",
"cluster_uuid" : "F_m6D20qSRm9ttiF0ySuBw",
"version" : {
"number" : "7.10.2",
"build_type" : "rpm",
"build_hash" : "6febcf7b53ff189de767e460e905e9e5aeecc8cb",
"build_date" : "2022-05-04T03:59:23.756957Z",
"build_snapshot" : false,
"lucene_version" : "8.10.1",
"minimum_wire_compatibility_version" : "6.8.0",
"minimum_index_compatibility_version" : "6.0.0-beta1"
},
"tagline" : "The OpenSearch Project: https://opensearch.org/"
}
[cyberkryption@opensearch ~]$ cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.1.234 opensearch.cyberkryption.local opensearch
There are currently 3 pre-flight checks:
- disk journal verification (enough space, writable directory, sizing)
- Elasticsearch / Opensearch availability and compatibility
- Mongodb availability and compatibility
With skip_preflight_checks you enable/disable all of them at once. What surprises me is that you can actually communicate with the OS instance if you skip the checks. I also asked @mpfz0r if he can check your report; maybe he has some additional ideas.
Hi @cyberkryption
I think the problem is that your certificate does not have any subjectAltNames configured.
You can also see this in the logs:
certificate: sha256/KeB6DfLCNFq3561pPhy8Zc/+oU6pmSySnrPyzHbwfvQ=
DN: CN=opensearch.cyberkryption.local, OU=Security, O=DefenceLogic, L=St.Helier, ST=Jersey, C=GB
subjectAltNames: []. - Hostname opensearch.cyberkryption.local not verified:
We are using OkHttp for the VersionProbe, which does not look at the cert's CN: https://github.com/square/okhttp/issues/4966
If you disable the version check entirely, only the elastic client will be used. It seems elastic uses the apache http client, which still cares about the CN.
I'd suggest recreating the certificates using something like openssl with -addext "subjectAltName = DNS:opensearch.cyberkryption.local"
I also filed a bug over at OpenSearch to see whether they can improve their documentation: https://github.com/opensearch-project/documentation-website/issues/730
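The difference between the two clients described above can be sketched in Python. This is a simplified model under stated assumptions, not the code of OkHttp or the Apache HTTP client: the function names are hypothetical, and wildcard handling is reduced to a single leftmost label.

```python
from typing import List, Optional


def _matches(pattern: str, hostname: str) -> bool:
    """Case-insensitive match; '*.' covers exactly one leftmost label."""
    pattern = pattern.lower().rstrip(".")
    hostname = hostname.lower().rstrip(".")
    if pattern.startswith("*."):
        suffix = pattern[1:]  # ".example.com"
        head, sep, tail = hostname.partition(".")
        return bool(head) and sep == "." and "." + tail == suffix
    return pattern == hostname


def verify_san_only(hostname: str, san_dns_names: List[str]) -> bool:
    """Strict check: only subjectAltName DNS entries count; the CN is ignored."""
    return any(_matches(name, hostname) for name in san_dns_names)


def verify_with_cn_fallback(
    hostname: str, san_dns_names: List[str], common_name: Optional[str]
) -> bool:
    """Legacy check: fall back to the subject CN when no SAN DNS entries exist."""
    if san_dns_names:
        return verify_san_only(hostname, san_dns_names)
    return common_name is not None and _matches(common_name, hostname)


host = "opensearch.cyberkryption.local"

# Certificate from the log: CN set, subjectAltNames empty.
print(verify_san_only(host, []))                # False
print(verify_with_cn_fallback(host, [], host))  # True

# Same certificate re-issued with a DNS subjectAltName.
print(verify_san_only(host, [host]))            # True
```

With an empty subjectAltNames list, as in the log above, the strict check fails while the CN fallback passes. That would be consistent with the version probe failing and other requests succeeding once the pre-flight checks are skipped.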
@cyberkryption did you have a chance to recreate your certificates? Can we close this ticket?
I faced that issue with a fresh Graylog 4.3.5-1 installation with elasticsearch-oss 7.10.2, and with elasticsearch x-pack 7.10.2 as well. The same error messages were seen in the log as mentioned in https://github.com/Graylog2/graylog2-server/issues/12897#issuecomment-1159001173. In running environments with Graylog 4.2, I do not experience this issue after an upgrade.
I can confirm the setting skip_preflight_checks= True as a workaround.
Hi Marco,
Please close the ticket, as I won't have the time to retest for a few weeks.
Cyberkryption
@xtruthx could you show me the output of openssl x509 -text -in your-elastic-ssl-cert.pem?
I'm considering this resolved. Please open a new ticket if that's not the case.