[Bug] SSLHandshakeException: Insufficient buffer remaining for AEAD cipher fragment
Seeing error
javax.net.ssl.SSLHandshakeException: Insufficient buffer remaining for AEAD cipher fragment (2). Needs to be more than tag size (16) during OpenSearch startup
Error: 9-04T06:39:28,837][ERROR][o.o.s.s.t.SecuritySSLNettyTransport] [smoketestnode] Exception during establishing a SSL connection: javax.net.ssl.SSLHandshakeException: Insufficient buffer remaining for AEAD cipher fragment (2). Needs to be more than tag size (16)
javax.net.ssl.SSLHandshakeException: Insufficient buffer remaining for AEAD cipher fragment (2). Needs to be more than tag size (16)
at sun.security.ssl.Alert.createSSLException(Alert.java:131) ~[?:?]
at sun.security.ssl.TransportContext.fatal(TransportContext.java:360) ~[?:?]
at sun.security.ssl.TransportContext.fatal(TransportContext.java:303) ~[?:?]
at sun.security.ssl.TransportContext.fatal(TransportContext.java:298) ~[?:?]
at sun.security.ssl.SSLTransport.decode(SSLTransport.java:134) ~[?:?]
at sun.security.ssl.SSLEngineImpl.decode(SSLEngineImpl.java:681) ~[?:?]
at sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:636) ~[?:?]
at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:454) ~[?:?]
at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:433) ~[?:?]
at javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:637) ~[?:?]
at io.netty.handler.ssl.JdkSslEngine.unwrap(JdkSslEngine.java:92) ~[netty-handler-4.1.97.Final.jar:4.1.97.Final]
at io.netty.handler.ssl.JdkAlpnSslEngine.unwrap(JdkAlpnSslEngine.java:163) ~[netty-handler-4.1.97.Final.jar:4.1.97.Final]
at io.netty.handler.ssl.SslHandler$SslEngineType$3.unwrap(SslHandler.java:309) ~[netty-handler-4.1.97.Final.jar:4.1.97.Final]
at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1436) ~[netty-handler-4.1.97.Final.jar:4.1.97.Final]
at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1329) ~[netty-handler-4.1.97.Final.jar:4.1.97.Final]
at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1378) ~[netty-handler-4.1.97.Final.jar:4.1.97.Final]
at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:529) ~[netty-codec-4.1.97.Final.jar:4.1.97.Final]
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:468) ~[netty-codec-4.1.97.Final.jar:4.1.97.Final]
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:290) ~[netty-codec-4.1.97.Final.jar:4.1.97.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444) [netty-transport-4.1.97.Final.jar:4.1.97.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) [netty-transport-4.1.97.Final.jar:4.1.97.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412) [netty-transport-4.1.97.Final.jar:4.1.97.Final]
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) [netty-transport-4.1.97.Final.jar:4.1.97.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440) [netty-transport-4.1.97.Final.jar:4.1.97.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) [netty-transport-4.1.97.Final.jar:4.1.97.Final]
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) [netty-transport-4.1.97.Final.jar:4.1.97.Final]
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166) [netty-transport-4.1.97.Final.jar:4.1.97.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788) [netty-transport-4.1.97.Final.jar:4.1.97.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:689) [netty-transport-4.1.97.Final.jar:4.1.97.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:652) [netty-transport-4.1.97.Final.jar:4.1.97.Final]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562) [netty-transport-4.1.97.Final.jar:4.1.97.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) [netty-common-4.1.97.Final.jar:4.1.97.Final]
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.97.Final.jar:4.1.97.Final]
at java.lang.Thread.run(Thread.java:829) [?:?]
Caused by: javax.crypto.BadPaddingException: Insufficient buffer remaining for AEAD cipher fragment (2). Needs to be more than tag size (16)
at sun.security.ssl.SSLCipher$T13GcmReadCipherGenerator$GcmReadCipher.decrypt(SSLCipher.java:1894) ~[?:?]
at sun.security.ssl.SSLEngineInputRecord.decodeInputRecord(SSLEngineInputRecord.java:240) ~[?:?]
at sun.security.ssl.SSLEngineInputRecord.decode(SSLEngineInputRecord.java:197) ~[?:?]
at sun.security.ssl.SSLEngineInputRecord.decode(SSLEngineInputRecord.java:160) ~[?:?]
at sun.security.ssl.SSLTransport.decode(SSLTransport.java:111) ~[?:?]
... 29 more
Expected result
Should not see errors from underlying system configuration
Additional context
- Generated during plugin install tests, see log https://github.com/opensearch-project/security/actions/runs/6069984984/job/16465215605
- There might be ways to mitigate this by catching the error, or maybe there is an issue with an upstream system? See https://github.com/xnio/xnio/pull/282/commits/2ae0c1b368c6e0886f9c03030ec443c346cb0a71#diff-9f162fd704e4d5eaff8abb932a923ae868adf3f642d85bc6f5bcfd5f7fe1763bR695
Known issue in JDK: https://bugs.openjdk.org/browse/JDK-8221218. Maybe it's been resolved in JDK20
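For readers unfamiliar with the message: the numbers say the decryptor was handed a fragment of 2 bytes, while an AES-GCM fragment has to be longer than its 16-byte authentication tag. The following is a minimal sketch of that constraint using plain javax.crypto. It is not the JDK's internal SSLCipher code path from the stack trace, and the class name, all-zero key, and all-zero IV are made up for the demonstration.

```java
import javax.crypto.BadPaddingException;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class ShortAeadFragmentDemo {
    // GCM tag size in bytes, matching "tag size (16)" in the error message.
    static final int TAG_BYTES = 16;

    /** Returns true if AES-GCM refuses to decrypt a fragment of the given length. */
    static boolean rejects(int fragmentLength) throws Exception {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        // Demo-only key and IV (all zeros); never do this with real traffic.
        SecretKeySpec key = new SecretKeySpec(new byte[16], "AES");
        GCMParameterSpec spec = new GCMParameterSpec(TAG_BYTES * 8, new byte[12]);
        cipher.init(Cipher.DECRYPT_MODE, key, spec);
        try {
            cipher.doFinal(new byte[fragmentLength]);
            return false;
        } catch (BadPaddingException e) {
            // AEADBadTagException is a subclass of BadPaddingException,
            // the same exception type as the "Caused by" in the trace.
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("2-byte fragment rejected: " + rejects(2));
    }
}
```

A fragment shorter than the tag can never be valid, so the interesting question in this issue is why the engine was given such a short record in the first place.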
I have the same issue using the latest Helm charts and Docker images. Interestingly, it worked for a while; after re-creating the CA and certs it stopped working consistently.
Got the same issue. During a cluster migration from 2.8 to 2.9, one of the nodes could not start. The root cause is not clear so far.
[Triage] Going to leave this untriaged since we don't really know how to move forward yet. We can keep the issue though and add more info if we encounter this further.
[Triage] Per @willyborankin's suggestion, you can reproduce it by starting a migration and adding a new node during migration with the same certificate. Any fixes for the issues will be accepted. Likely a change around 1.7.6 or jdk20.
PR with BC 1.76 was merged in OpenSearch.
Hi guys. The problem still persists in v2.11.0. Could you please let us know in which version a fix will be available?
Also having this issue using the latest tag. Note that this rule is off: plugins.security.ssl.transport.enforce_hostname_verification: false
And I am using proper plugins.security.nodes_dn settings.
The bug is still not resolved (15.01.2024). As a workaround, use TLS 1.2 instead of TLS 1.3: either pass the VM argument -Djdk.tls.client.protocols=TLSv1.2 or, if you configure the Netty SSL handler yourself: SslHandler handler = sslContext.newHandler(socketChannel.alloc()); handler.engine().setEnabledProtocols(new String[] {"TLSv1.2"});
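The engine-level variant of this workaround can be tried without Netty. Below is a minimal sketch, assuming the JDK's default SSLContext; the class and method names are hypothetical. Note that -Djdk.tls.client.protocols only changes the JDK's client-side defaults, which may be why a later comment in this thread reports that the flag alone did not make the error go away.

```java
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;
import java.util.Arrays;

public class Tls12OnlyEngine {
    /** Creates an SSLEngine from the default context that only negotiates TLS 1.2. */
    static SSLEngine newTls12Engine() throws Exception {
        SSLContext context = SSLContext.getDefault();
        SSLEngine engine = context.createSSLEngine();
        // Same call the Netty snippet makes via handler.engine().
        engine.setEnabledProtocols(new String[] {"TLSv1.2"});
        return engine;
    }

    public static void main(String[] args) throws Exception {
        SSLEngine engine = newTls12Engine();
        System.out.println(Arrays.toString(engine.getEnabledProtocols()));
    }
}
```

Restricting the protocol on the engine applies to both client and server mode of that engine, so both ends of a node-to-node connection need the restriction for TLS 1.3 to be avoided entirely.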
Seems like a bug in JDK: https://bugs.openjdk.java.net/browse/JDK-8221218
See this forum post for more details: https://forum.opensearch.org/t/cluster-does-not-initialize-javax-net-ssl-sslhandshakeexception-insufficient-buffer-remaining-for-aead-cipher-fragment/2845/5
Like others have said this seems to be a known issue with how the JDK handles TLS:
https://bugs.openjdk.org/browse/JDK-8221218
If you look at the comments here, they seem to suggest fixes have occurred but obviously this is not the case... It is also worth pointing out that neither of the fixes were actually intended to address this specific issue. I am not sure why they closed this issue as resolved when the linked changes were for separate bugs...
Further examples of the issue being known:
Oracle support page (https://support.oracle.com/knowledge/Middleware/2519569_1.html)
Applies to: Oracle WebLogic Server - Version 12.1.3.0.0 and later
Another project running into this issue:
https://forum.portswigger.net/thread/complete-proxy-failure-due-to-java-tls-bug-1e334581
Thanks for reporting this. It is a known unresolved bug in OpenJDK
One last attempt to fix this would be looking at increasing the Bouncycastle version:
https://github.com/tkohegyi/mitmJavaProxy/issues/12
I use JDK15 and later + org.bouncycastle/bcpkix-jdk18on/1.71.1 and I cannot repro it anymore
I will try to do this and see if it is possible but I am not sure about reproducing the issue consistently so it may be challenging to test.
@LHozzan @Thrallix @VovkaSOL We've been having no luck with this issue. One thing I'm trying to understand is how impactful this issue is to you. From our evidence it looks like this has only happened during cluster startup. If it's a startup issue, that is unfortunate but limited in overall impact. Whereas if this issue happens intermittently on a cluster and takes down a node, then we should invest more time. Can you help provide us with details of your reproduction?
I am seeing this issue consistently after trying to change cert providers. I did a full cluster restart and I'm getting that error on all of my nodes. I don't know if it's relevant but the old certs we were using were RSA, while the new certs are id-ecPublicKey
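Since the key type of the certificate may matter here (RSA before, id-ecPublicKey after), it can help to report exactly what a given certificate contains. A minimal sketch using only the JDK follows; the class name and the describe helper are hypothetical, and the certificate path is whatever file you pass on the command line.

```java
import java.io.FileInputStream;
import java.io.InputStream;
import java.security.KeyPairGenerator;
import java.security.PublicKey;
import java.security.cert.CertificateFactory;
import java.security.cert.X509Certificate;
import java.security.interfaces.ECPublicKey;
import java.security.interfaces.RSAPublicKey;

public class CertKeyInfo {
    /** Returns the key algorithm and size, e.g. "RSA 2048" or "EC 256". */
    static String describe(PublicKey key) {
        if (key instanceof RSAPublicKey) {
            return "RSA " + ((RSAPublicKey) key).getModulus().bitLength();
        }
        if (key instanceof ECPublicKey) {
            // The bit length of the curve order is the usual "size" of an EC key.
            return "EC " + ((ECPublicKey) key).getParams().getOrder().bitLength();
        }
        return key.getAlgorithm();
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 1) {
            // Read a PEM or DER encoded X.509 certificate from the given path.
            try (InputStream in = new FileInputStream(args[0])) {
                X509Certificate cert = (X509Certificate)
                        CertificateFactory.getInstance("X.509").generateCertificate(in);
                System.out.println(describe(cert.getPublicKey())
                        + ", signed with " + cert.getSigAlgName());
            }
        } else {
            // No certificate given: show the output format on a throwaway EC key.
            KeyPairGenerator generator = KeyPairGenerator.getInstance("EC");
            generator.initialize(256);
            System.out.println(describe(generator.generateKeyPair().getPublic()));
        }
    }
}
```

Running it against both the old and the new node certificate gives a compact line per certificate that can be pasted into a report.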
@reshippie (and anyone else experiencing this issue), could you include the operating system version, JDK version, and OpenSearch distro version, plus the basic cluster topology (e.g. 3 data nodes, 2 cluster managers) and anything interesting about your security configuration?
If you don't feel comfortable posting that information publicly, feel free to reach out to me first on our slack instance, I'm Peter Nied or email pet ern @ am az on .co m (remove the spaces)
We're running:
- Debian 10.13
- OpenSearch 2.9.0
- Bundled Java 17.0.7
- 6 data nodes, 3 managers, 1 coordinating node (for Dashboards)
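One way to collect the JDK and OS part of the details requested above is a small program run with the same JVM that OpenSearch uses (for the bundled JDK, that is an assumption you need to check for your install). This is only a sketch; the class name and output format are made up.

```java
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLParameters;
import java.util.Arrays;

public class TlsEnvReport {
    /** Builds a short, paste-able summary of the JVM, OS, and default TLS protocols. */
    static String report() throws Exception {
        SSLParameters defaults = SSLContext.getDefault().getDefaultSSLParameters();
        return "java.version=" + System.getProperty("java.version") + "\n"
                + "java.vendor=" + System.getProperty("java.vendor") + "\n"
                + "os=" + System.getProperty("os.name") + " "
                + System.getProperty("os.version") + " "
                + System.getProperty("os.arch") + "\n"
                + "default TLS protocols=" + Arrays.toString(defaults.getProtocols());
    }

    public static void main(String[] args) throws Exception {
        System.out.println(report());
    }
}
```

The protocol list shows whether TLSv1.3 is enabled by default on that JVM, which is relevant because the failing frames in the traces come from the TLS 1.3 GCM read cipher.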
I don't think there's anything interesting in our security config
plugins.security.ssl_cert_reload_enabled: true
plugins.security.ssl.transport.enforce_hostname_verification: false
plugins.security.advanced_modules_enabled: true
plugins.security.nodes_dn:
- 'CN=dashboards-*-mgmt'
- 'CN=esmaster-*-mgmt'
- 'CN=elasticsearch-*-mgmt'
- 'CN=osdata-*-mgmt'
# Transport layer TLS
plugins.security.ssl.transport.enabled: true
plugins.security.ssl.transport.pemkey_filepath: ssl/{{ ansible_hostname }}-mgmt.pk8
plugins.security.ssl.transport.pemcert_filepath: ssl/{{ ansible_hostname }}-mgmt.crt
plugins.security.ssl.transport.pemtrustedcas_filepath: ssl/{{ ansible_hostname }}-mgmt.issuer.crt
plugins.security.ssl.transport.truststore_filepath: cacerts
#
# REST layer TLS
plugins.security.ssl.http.enabled: true
plugins.security.ssl.http.pemkey_filepath: ssl/{{ ansible_hostname }}-mgmt.pk8
plugins.security.ssl.http.pemcert_filepath: ssl/{{ ansible_hostname }}-mgmt.crt
plugins.security.ssl.http.pemtrustedcas_filepath: ssl/{{ ansible_hostname }}-mgmt.issuer.crt
plugins.security.restapi.roles_enabled: ["admin_role", "security_rest_api_access"]
plugins.security.authcz.admin_dn: CN=DOMAIN.org
I tried the solution posted by @VovkaSOL. Adding -Djdk.tls.client.protocols=TLSv1.2 did not make the error go away.
I looked into updating the bouncycastle version as mentioned above. We would need to follow something similar to when it was moved to https://github.com/opensearch-project/OpenSearch/pull/8247
At the time, @willyborankin only bumped to 15to18 because of the multi-release jars. I don't know if it is feasible to move past that point or if OpenSearch can handle the later version. @willyborankin do you know?
@scrawfor99 Not sure about it, we still support JDK 1.8 build AFAIK.
@willyborankin, I think 18on will still work with 1.8. I saw you made the swap to 15to18 though and not 18on in the linked PR so was not sure whether you knew what was or was not compatible.
With the updates to Bouncy Castle, I am going to close this issue, as this is the most we can currently do to resolve the exception. Based on some other discussions, the update to Bouncy Castle should help resolve the failures.
Hi @peternied .
Sorry for the delayed response.
We've been having no luck with this issue, one thing I'm trying to understand is how impactful this issue is to you. From our evidence it looks like this has only happened during cluster startup. If its a startup issue is unfortunate, but limited in overall impact. Whereas - if this issue happens intermittently on a cluster and takes down a node then we should invest more time, can you help provide use with details of your reproduction?
In our infrastructure this problem occurs randomly on nodes of all roles. If it occurs on only one coordinator node, the second replica keeps working; but if both replicas are hit, the cluster is basically useless, even though the managers and data nodes are working fine. The same applies if nodes with other roles are affected at the same time or with some delay. We have monitoring that watches whether the components in front of the OpenSearch cluster can connect to it, but it is inconvenient.
We are currently using the default community Docker image opensearchproject/opensearch:2.11.1, but only for a short time so far. Our clusters run only in AWS and Microsoft's cloud, and I can observe the same problem on both providers.
Basic cluster topology (3 data nodes, 2 cluster managers). Anything interesting about your security configuration.
The problem occurs in both of the setups we use:
- one multirole node
- 2 coordinators, 2 manager, 2 data nodes
Based on my observation, it seems to occur more often on the multirole setup, but I do not have exact data.
@scrawfor99 OK, let's wait for the next release (2.12.x); hopefully the problem will be fixed there. If it persists, I will let you know.
Hi @LHozzan, do you use Wireguard/IPSec as an additional encryption mechanism for the communication between nodes? If yes, the problem could be related to the Wireguard/IPSec configuration.
After installation (2 data nodes, 1 manager node) with the demo config, I updated opensearch.yml with the following:
plugins.security.ssl.transport.pemcert_filepath: tls.crt
plugins.security.ssl.transport.pemkey_filepath: tls.key
plugins.security.ssl.transport.pemtrustedcas_filepath: ca.crt
plugins.security.ssl.transport.enforce_hostname_verification: false
plugins.security.ssl.http.enabled: true
plugins.security.ssl.http.pemcert_filepath: tls.crt
plugins.security.ssl.http.pemkey_filepath: tls.key
plugins.security.ssl.http.pemtrustedcas_filepath: ca.crt
plugins.security.allow_unsafe_democertificates: false
plugins.security.allow_default_init_securityindex: true
plugins.security.authcz.admin_dn: ['CN=admin']
plugins.security.audit.type: internal_opensearch
plugins.security.enable_snapshot_restore_privilege: true
plugins.security.check_snapshot_restore_write_privileges: true
plugins.security.restapi.roles_enabled: [all_access, security_rest_api_access]
plugins.security.system_indices.enabled: true
plugins.security.system_indices.indices:
- .plugins-ml-agent
- .plugins-ml-config
- .plugins-ml-connector
- .plugins-ml-controller
- .plugins-ml-model-group
- .plugins-ml-model
- .plugins-ml-task
- .plugins-ml-conversation-meta
- .plugins-ml-conversation-interactions
- .plugins-ml-memory-meta
- .plugins-ml-memory-message
- .plugins-ml-stop-words
- .opendistro-alerting-config
- .opendistro-alerting-alert*
- .opendistro-anomaly-results*
- .opendistro-anomaly-detector*
- .opendistro-anomaly-checkpoints
- .opendistro-anomaly-detection-state
- .opendistro-reports-*
- .opensearch-notifications-*
- .opensearch-notebooks
- .opensearch-observability
- .ql-datasources
- .opendistro-asynchronous-search-response*
- .replication-metadata-store
- .opensearch-knn-models
- .geospatial-ip2geo-data*
- .plugins-flow-framework-config
- .plugins-flow-framework-templates
- .plugins-flow-framework-state
plugins.security.ssl.http.enabled_protocols:
- "TLSv1.2"
plugins.security.nodes_dn:
- 'CN=node'
Then I ran
/usr/share/opensearch/plugins/opensearch-security/tools/securityadmin.sh -icl -nhnv \
-cd "/usr/share/opensearch/config/opensearch-security" \
-key "/usr/share/opensearch/config/kirk-key.pem" \
-cert "/usr/share/opensearch/config/kirk.pem" \
-cacert "/usr/share/opensearch/config/root-ca.pem"
After that point, I keep getting errors.
The following makefile generates my keys
keys/root-ca.key:
	mkdir -p keys;
	openssl genrsa -out keys/root-ca.key 2048;

keys/ca.crt: keys/root-ca.key
	openssl req -new -x509 -sha256 -key keys/root-ca.key -out keys/ca.crt -days 730 -subj "/CN=ca.local";

keys/admin.key:
	mkdir -p keys;
	openssl genrsa -out keys/admin-temp.key 2048;
	openssl pkcs8 -inform PEM -outform PEM -in keys/admin-temp.key -topk8 -nocrypt -v1 PBE-SHA1-3DES -out keys/admin.key
	rm keys/admin-temp.key;

keys/admin.crt: keys/admin.key keys/ca.crt keys/root-ca.key
	openssl req -new -key keys/admin.key -out keys/admin.csr -subj "/CN=admin";
	openssl x509 -req -in keys/admin.csr -CA keys/ca.crt -CAkey keys/root-ca.key -CAcreateserial -sha256 -out keys/admin.crt -days 730;
	rm keys/admin.csr;

keys/tls.key:
	openssl genrsa -out keys/tls-temp.key 2048;
	openssl pkcs8 -inform PEM -outform PEM -in keys/tls-temp.key -topk8 -nocrypt -v1 PBE-SHA1-3DES -out keys/tls.key
	rm keys/tls-temp.key;

keys/tls.crt: keys/tls.key keys/ca.crt keys/root-ca.key
	openssl req -new -key keys/tls.key -out keys/tls.csr -subj "/CN=node";
	openssl x509 -req -in keys/tls.csr -CA keys/ca.crt -CAkey keys/root-ca.key -CAcreateserial -sha256 -out keys/tls.crt -days 730;
	rm keys/tls.csr;

removeoldkeys:
	rm -rf keys;

makekeys: removeoldkeys keys/admin.key keys/admin.crt keys/tls.key keys/tls.crt keys/ca.crt
	@echo "Keys are generated.";
I am stuck here for a while, please help! 🙏
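A quick sanity check for keys produced by the pkcs8 -topk8 -nocrypt steps in the Makefile above: an unencrypted PKCS#8 PEM file (header "BEGIN PRIVATE KEY") is the form the JDK's KeyFactory reads through PKCS8EncodedKeySpec. The sketch below round-trips a freshly generated RSA key through that PEM form. It only shows that the encoding is readable by the JDK and does not diagnose the handshake error; the class and helper names are hypothetical.

```java
import java.security.KeyFactory;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.interfaces.RSAPrivateKey;
import java.security.spec.PKCS8EncodedKeySpec;
import java.util.Base64;

public class Pkcs8PemCheck {
    /** Parses an unencrypted PKCS#8 PEM string ("BEGIN PRIVATE KEY") as an RSA key. */
    static PrivateKey parseRsaPkcs8Pem(String pem) throws Exception {
        String base64 = pem
                .replace("-----BEGIN PRIVATE KEY-----", "")
                .replace("-----END PRIVATE KEY-----", "")
                .replaceAll("\\s", "");
        byte[] der = Base64.getDecoder().decode(base64);
        return KeyFactory.getInstance("RSA").generatePrivate(new PKCS8EncodedKeySpec(der));
    }

    /** Encodes a private key as PKCS#8 PEM with 64-character lines. */
    static String toPem(PrivateKey key) {
        String body = Base64.getMimeEncoder(64, new byte[] {'\n'})
                .encodeToString(key.getEncoded());
        return "-----BEGIN PRIVATE KEY-----\n" + body + "\n-----END PRIVATE KEY-----\n";
    }

    /** Compares two RSA private keys by modulus and private exponent. */
    static boolean sameRsaKey(PrivateKey a, PrivateKey b) {
        RSAPrivateKey x = (RSAPrivateKey) a;
        RSAPrivateKey y = (RSAPrivateKey) b;
        return x.getModulus().equals(y.getModulus())
                && x.getPrivateExponent().equals(y.getPrivateExponent());
    }

    public static void main(String[] args) throws Exception {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048);
        PrivateKey original = generator.generateKeyPair().getPrivate();
        PrivateKey parsed = parseRsaPkcs8Pem(toPem(original));
        System.out.println("round trip ok: " + sameRsaKey(original, parsed));
    }
}
```

To check a real file such as keys/tls.key, read its contents into a string and pass it to parseRsaPkcs8Pem; an exception there would point at the key encoding rather than at the TLS handshake.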
I'm seeing errors like this in master node logs:
[2024-06-05T01:05:39,152][INFO ][o.o.s.a.s.DebugSink ] [opensearch-cluster-master-2] AUDIT_LOG: {
"audit_node_id" : "lP5ZYpVDR1O9n8EDWhKe1g",
"audit_request_layer" : "TRANSPORT",
"audit_request_exception_stacktrace" : "javax.net.ssl.SSLHandshakeException: Insufficient buffer remaining for AEAD cipher fragment (2). Needs to be more than tag size (16)\n\tat java.base/sun.security.ssl.Alert.createSSLException(Alert.java:130)\n\tat java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:378)\n\tat java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:321)\n\tat java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:316)\n\tat java.base/sun.security.ssl.SSLTransport.decode(SSLTransport.java:134)\n\tat java.base/sun.security.ssl.SSLEngineImpl.decode(SSLEngineImpl.java:736)\n\tat java.base/sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:691)\n\tat java.base/sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:506)\n\tat java.base/sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:482)\n\tat java.base/javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:679)\n\tat io.netty.handler.ssl.SslHandler$SslEngineType$3.unwrap(SslHandler.java:310)\n\tat io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1445)\n\tat io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1338)\n\tat io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1387)\n\tat io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:530)\n\tat io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:469)\n\tat io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:290)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)\n\tat io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)\n\tat io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)\n\tat 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)\n\tat io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)\n\tat io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)\n\tat io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)\n\tat io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:689)\n\tat io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:652)\n\tat io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)\n\tat io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)\n\tat io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)\n\tat java.base/java.lang.Thread.run(Thread.java:1583)\nCaused by: javax.crypto.BadPaddingException: Insufficient buffer remaining for AEAD cipher fragment (2). Needs to be more than tag size (16)\n\tat java.base/sun.security.ssl.SSLCipher$T13GcmReadCipherGenerator$GcmReadCipher.decrypt(SSLCipher.java:1864)\n\tat java.base/sun.security.ssl.SSLEngineInputRecord.decodeInputRecord(SSLEngineInputRecord.java:239)\n\tat java.base/sun.security.ssl.SSLEngineInputRecord.decode(SSLEngineInputRecord.java:196)\n\tat java.base/sun.security.ssl.SSLEngineInputRecord.decode(SSLEngineInputRecord.java:159)\n\tat java.base/sun.security.ssl.SSLTransport.decode(SSLTransport.java:111)\n\t... 27 more\n",
"@timestamp" : "2024-06-05T01:00:55.484+00:00",
"audit_request_effective_user_is_admin" : false,
"audit_cluster_name" : "opensearch-cluster",
"audit_format_version" : 4,
"audit_node_host_address" : "10.200.2.124",
"audit_node_name" : "opensearch-cluster-master-2",
"audit_category" : "SSL_EXCEPTION",
"audit_request_origin" : "TRANSPORT",
"audit_node_host_name" : "10.200.2.124"
}
Here's the expanded stack trace:
javax.net.ssl.SSLHandshakeException: Insufficient buffer remaining for AEAD cipher fragment (2). Needs to be more than tag size (16)
at java.base/sun.security.ssl.Alert.createSSLException(Alert.java:130)
at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:378)
at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:321)
at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:316)
at java.base/sun.security.ssl.SSLTransport.decode(SSLTransport.java:134)
at java.base/sun.security.ssl.SSLEngineImpl.decode(SSLEngineImpl.java:736)
at java.base/sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:691)
at java.base/sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:506)
at java.base/sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:482)
at java.base/javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:679)
at io.netty.handler.ssl.SslHandler$SslEngineType$3.unwrap(SslHandler.java:310)
at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1445)
at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1338)
at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1387)
at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:530)
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:469)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:290)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:689)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:652)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: javax.crypto.BadPaddingException: Insufficient buffer remaining for AEAD cipher fragment (2). Needs to be more than tag size (16)
at java.base/sun.security.ssl.SSLCipher$T13GcmReadCipherGenerator$GcmReadCipher.decrypt(SSLCipher.java:1864)
at java.base/sun.security.ssl.SSLEngineInputRecord.decodeInputRecord(SSLEngineInputRecord.java:239)
at java.base/sun.security.ssl.SSLEngineInputRecord.decode(SSLEngineInputRecord.java:196)
at java.base/sun.security.ssl.SSLEngineInputRecord.decode(SSLEngineInputRecord.java:159)
at java.base/sun.security.ssl.SSLTransport.decode(SSLTransport.java:111)
... 27 more
I'm using container image docker.io/opensearchproject/opensearch:2.14.0@sha256:96af4ace999e20f3f74b1675e501d7dba46f2e7c185cfcffd4626898b00e6743 on linux/arm64.
I don't think this is fixed. Could someone please re-open?
The same error happened here. What caused it in my case was using a single cert with SANs for all my cluster nodes. I've used this kind of cert for other services without any problems. I hope you can fix this issue!
Exact same issue here on 2.18.0. Seems to start occurring more frequently when I start shipping logs from fluent-bit, effectively nuking my cluster. A client decimating a server with a faulty TLS handshake seems like a super critical vulnerability to me.
I just installed Graylog and two Datanodes. I'm seeing this issue on one of the datanodes, but the other works fine. Does anyone have a fix that works? I've tried most of the suggestions above to resolve this, but no luck.
The fix for me was to disable hostname verification, which is unfortunate (plugins.security.ssl.transport.enforce_hostname_verification: false). Also, in my case the cert CN was a wildcard (*.test.com) while my actual hostname is opensearch.test.com, so maybe there's a weird matching issue going on.
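For what it's worth, under the usual wildcard rule (RFC 6125 style) a pattern like *.test.com is expected to match opensearch.test.com, so the wildcard by itself may not be the culprit. Below is a minimal sketch of that rule for sanity-checking names. The helper `wildcard_matches` is hypothetical, written only for illustration. It is not the JDK's or the security plugin's actual matching code, and real verifiers generally look at subjectAltName entries rather than the CN.

```python
def wildcard_matches(pattern: str, hostname: str) -> bool:
    """Return True if a certificate name pattern matches a hostname.

    Simplified rule (an assumption for illustration): '*' is honoured only
    as the entire left-most label and stands for exactly one label.
    """
    p_labels = pattern.lower().rstrip(".").split(".")
    h_labels = hostname.lower().rstrip(".").split(".")

    # A wildcard never spans several labels, so label counts must agree.
    if len(p_labels) != len(h_labels):
        return False

    if p_labels[0] == "*":
        # The wildcard covers one non-empty label; the rest must be identical.
        return h_labels[0] != "" and p_labels[1:] == h_labels[1:]

    return p_labels == h_labels


print(wildcard_matches("*.test.com", "opensearch.test.com"))
print(wildcard_matches("*.test.com", "a.b.test.com"))
print(wildcard_matches("*.test.com", "test.com"))
```

If the first call returns True for your names, the mismatch is more likely in the SAN entries (or a node connecting by IP address) than in the wildcard itself.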
@khamilton59 if you're still having this issue, please post on the Graylog Community forum and we'll try to help as best we can over there!
Brand new cluster, seeing same issue on 2.19.1. Can't run securityadmin.sh because of the issue as well. Get the same stack as OP using
plugins:
  security:
    ssl:
      transport:
        enforce_hostname_verification: false
# /usr/share/opensearch/plugins/opensearch-security/tools/securityadmin.sh -cd /etc/opensearch/opensearch-security -icl -h 10.xxx.0.8 -key /etc/opensearch/certs/admin-key-pkcs1.pem -cert /etc/opensearch/certs/admin.pem -cacert /etc/opensearch/certs/root-ca.pem -nhnv
Security Admin v7
Will connect to 10.xxx.0.8:9200 ... done
ERR: An unexpected SSLHandshakeException occured: Received fatal alert: handshake_failure
See https://opensearch.org/docs/latest/clients/java-rest-high-level/ for troubleshooting.
Trace:
javax.net.ssl.SSLHandshakeException: Received fatal alert: handshake_failure
See https://opensearch.org/docs/latest/clients/java-rest-high-level/ for troubleshooting.
at org.opensearch.client.RestClient.extractAndWrapCause(RestClient.java:1241)
at org.opensearch.client.RestClient.performRequest(RestClient.java:358)
at org.opensearch.client.RestClient.performRequest(RestClient.java:346)
at org.opensearch.security.tools.SecurityAdmin.execute(SecurityAdmin.java:575)
at org.opensearch.security.tools.SecurityAdmin.main(SecurityAdmin.java:165)
Caused by: javax.net.ssl.SSLHandshakeException: Received fatal alert: handshake_failure
at java.base/sun.security.ssl.Alert.createSSLException(Alert.java:131)
at java.base/sun.security.ssl.Alert.createSSLException(Alert.java:117)
at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:353)
at java.base/sun.security.ssl.Alert$AlertConsumer.consume(Alert.java:293)
at java.base/sun.security.ssl.TransportContext.dispatch(TransportContext.java:192)
at java.base/sun.security.ssl.SSLTransport.decode(SSLTransport.java:172)
at java.base/sun.security.ssl.SSLEngineImpl.decode(SSLEngineImpl.java:681)
at java.base/sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:636)
at java.base/sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:454)
at java.base/sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:433)
at java.base/javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:637)
at org.apache.http.nio.reactor.ssl.SSLIOSession.doUnwrap(SSLIOSession.java:279)
at org.apache.http.nio.reactor.ssl.SSLIOSession.decryptData(SSLIOSession.java:505)
at org.apache.http.nio.reactor.ssl.SSLIOSession.isAppInputReady(SSLIOSession.java:548)
at org.apache.http.impl.nio.reactor.AbstractIODispatch.inputReady(AbstractIODispatch.java:120)
at org.apache.http.impl.nio.reactor.BaseIOReactor.readable(BaseIOReactor.java:162)
at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:337)
at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:315)
at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:276)
at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:591)
at java.base/java.lang.Thread.run(Thread.java:829)
However, running a curl works just fine
root@opensearch-master-01:~# curl https://10.xxx.0.8:9200/_cluster/health
curl: (60) SSL certificate problem: unable to get local issuer certificate
More details here: https://curl.se/docs/sslcerts.html
curl failed to verify the legitimacy of the server and therefore could not
establish a secure connection to it. To learn more about this situation and
how to fix it, please visit the webpage mentioned above.
# This works just fine
root@opensearch-master-01:~# curl -vvvv --cacert /etc/opensearch/certs/root-ca.pem https://10.xxx.0.8:9200/_cluster/health
* Trying 10.xxx.0.8:9200...
* Connected to 10.xxx.0.8 (10.xxx.0.8) port 9200
* ALPN: curl offers h2,http/1.1
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* CAfile: /etc/opensearch/certs/root-ca.pem
* CApath: /etc/ssl/certs
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Request CERT (13):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Certificate (11):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_128_GCM_SHA256 / x25519 / RSASSA-PSS
* ALPN: server did not agree on a protocol. Uses default.
* Server certificate:
* subject: CN=es-master-01
* start date: Mar 24 16:15:42 2025 GMT
* expire date: Mar 24 16:15:42 2026 GMT
* subjectAltName: host "10.xxx.0.8" matched cert's IP address!
* issuer: CN=OpenSearch CA
* SSL certificate verify ok.
* Certificate level 0: Public key type RSA (2048/112 Bits/secBits), signed using sha256WithRSAEncryption
* Certificate level 1: Public key type RSA (4096/152 Bits/secBits), signed using sha256WithRSAEncryption
* using HTTP/1.x
> GET /_cluster/health HTTP/1.1
> Host: 10.xxx.0.8:9200
> User-Agent: curl/8.9.1
> Accept: */*
>
* Request completely sent off
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
< HTTP/1.1 401 Unauthorized
< WWW-Authenticate: Basic realm="OpenSearch Security"
< content-type: text/plain; charset=UTF-8
< content-length: 12
<
* Connection #0 to host 10.xxx.0.8 left intact
I got the Insufficient buffer remaining for AEAD cipher fragment error when trying to use the Python client (browser and curl worked normally). After lots of trial and error, it seems the error was related to the SSL certificates not having properly configured extension fields, especially keyUsage and extendedKeyUsage, and the fact that urllib3 doesn't include /etc/ssl/certs/ca-certificates.crt by default.
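In case it helps anyone hitting the CA bundle part with a Python client, here is a minimal sketch that builds a verifying TLS context using only the standard library `ssl` module. The function name `build_ssl_context` and the idea of passing a custom CA path are assumptions for illustration, not something taken from the client's documentation.

```python
import ssl
from typing import Optional


def build_ssl_context(ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Create a client-side TLS context that verifies the server certificate.

    ca_file: optional path to a PEM CA bundle (for example a private root CA).
    When it is None, the system default CA certificates are loaded instead.
    """
    ctx = ssl.create_default_context(cafile=ca_file)
    # Refuse anything older than TLS 1.2 (a choice made for this sketch).
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


ctx = build_ssl_context()
print(ctx.verify_mode == ssl.CERT_REQUIRED, ctx.check_hostname)
```

The resulting context can then be handed to whatever HTTP client you use. As far as I recall, opensearch-py accepts an `ssl_context` (or a `ca_certs` path) argument, but check the client docs for your version. This only addresses trust in the CA. Certificates missing the right keyUsage / extendedKeyUsage extensions still have to be reissued.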