SASL GSSAPI not working with Kerberos

Open Knappek opened this issue 5 years ago • 14 comments

Hi, I am trying to configure authentication to Kafka via Kerberos using https://github.com/ElMaxo/brod_gssapi following the guide in the wiki. When I start my application I get the following error:

     ** (EXIT) [{{"kafka.kerberos-demo.local", 9093}, {{:case_clause, {:error, -4}}, [{:brod_gssapi, :auth, 6, [file: '/opt/sites/rig/deps/brod_gssapi/src/brod_gssapi.erl', line: 74]}, {:kpro_sasl, :auth, 7, [file: 'src/kpro_sasl.erl', line: 38]}, {:kpro_connection, :init_connection, 2, [file: 'src/kpro_connection.erl', line: 240]}, {:kpro_connection, :init, 4, [file: 'src/kpro_connection.erl', line: 170]}, {:proc_lib, :init_p_do_apply, 3, [file: 'proc_lib.erl', line: 249]}]}}]

Since execution reaches line 74 of the brod_gssapi plugin, the kinit call on line 51 apparently succeeded, which suggests that I am providing the Kerberos keytab and principal correctly, as described in your wiki.

The error {:case_clause, {:error, -4}} indicates a SASL_NOMECH error. That shouldn't happen, though, since the brod_gssapi plugin should take care of providing the GSSAPI mechanism to the sasl_auth library, shouldn't it?
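
For reference, the numeric codes in these error tuples can be mapped back to names with a small lookup. This is a minimal sketch in Python, assuming the values match the result codes defined in Cyrus SASL's sasl.h; the helper name describe_sasl_code is made up for illustration and is not part of brod_gssapi or sasl_auth.

```python
# Subset of the result codes defined in Cyrus SASL's sasl.h.
SASL_RESULT_CODES = {
    1: "SASL_CONTINUE",
    0: "SASL_OK",
    -1: "SASL_FAIL",
    -2: "SASL_NOMEM",
    -3: "SASL_BUFOVER",
    -4: "SASL_NOMECH",
    -5: "SASL_BADPROT",
    -6: "SASL_NOTDONE",
    -7: "SASL_BADPARAM",
    -8: "SASL_TRYAGAIN",
    -9: "SASL_BADMAC",
    -12: "SASL_NOTINIT",
}


def describe_sasl_code(code):
    """Hypothetical helper: translate a numeric SASL result into its name."""
    return SASL_RESULT_CODES.get(code, f"unknown SASL code {code}")


if __name__ == "__main__":
    # The code from the {:case_clause, {:error, -4}} tuple above.
    print(describe_sasl_code(-4))
```

SASL_NOMECH means the local SASL library could not find a usable mechanism, so it points at the client-side installation rather than at the broker.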

FYI: I am trying this out in a docker-compose: https://github.com/Accenture/reactive-interaction-gateway/tree/kafka-sasl-kerberos-authentication/examples/api-gateway/kafka-kerberos .

And this is where I configure the brod_gssapi plugin: https://github.com/Accenture/reactive-interaction-gateway/blob/kafka-sasl-kerberos-authentication/lib/rig/kafka_config.ex#L65 .

Knappek avatar Apr 03 '20 06:04 Knappek

It turned out that the sasl2 GSSAPI plugin wasn't actually installed, even though we thought it was; installing it solved the NOMECH error described above. However, we now see SASL_FAIL (-1) when connecting.

Following the tutorial in the wiki (no SSL, just SASL plain), I do this (sorry for the Elixir):

keytab = "/tmp/test/priv/rig.key"
principal = "rig/[email protected]"
:brod.start_client([{'host.docker.internal', 9093}], :client1, [{:sasl, {:callback, :brod_gssapi, {:gssapi, keytab, principal}}}])

This produces:

09:33:05.829 [warn]  :brod_client [#PID<0.220.0>] :client1 is terminating
reason: [{{'host.docker.internal', 9093}, {{:case_clause, {:error, -1}}, [{:brod_gssapi, :auth, 6, [file: '/tmp/test/deps/brod_gssapi/src/brod_gssapi.erl', line: 74]}, {:kpro_sasl, :auth, 7, [file: 'src/kpro_sasl.erl', line: 38]}, {:kpro_connection, :init_connection, 2, [file: 'src/kpro_connection.erl', line: 240]}, {:kpro_connection, :init, 4, [file: 'src/kpro_connection.erl', line: 170]}, {:proc_lib, :init_p_do_apply, 3, [file: 'proc_lib.erl', line: 249]}]}}]

In order to make sure the Kerberos config works as expected, I do this:

$ kdestroy
$ klist
klist: No credentials cache found (filename: /tmp/krb5cc_0)
$ klist -kte priv/rig.key
Keytab name: FILE:priv/rig.key
KVNO Timestamp         Principal
---- ----------------- --------------------------------------------------------
   1 04/10/20 09:03:55 rig/[email protected] (aes256-cts-hmac-sha1-96)
   1 04/10/20 09:03:55 rig/[email protected] (aes128-cts-hmac-sha1-96)
$ kinit -kt priv/rig.key rig/[email protected]
$ klist
Ticket cache: FILE:/tmp/krb5cc_0
Default principal: rig/[email protected]

Valid starting     Expires            Service principal
04/10/20 09:34:27  04/11/20 09:34:26  krbtgt/[email protected]
	renew until 04/10/20 09:34:27

Any ideas how we could debug this?

kevinbader avatar Apr 10 '20 09:04 kevinbader

Okay, SASL works now (the client node was missing an /etc/hosts entry for the Kafka broker). But we still can't connect; it seems like the Kafka protocol implementation doesn't work.
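
A quick way to catch a missing hosts entry is to check name resolution from the client node before starting the client. The following is a minimal Python sketch, not something brod or brod_gssapi provides; the function name resolve_broker and the injectable resolver parameter are assumptions made for illustration.

```python
import socket


def resolve_broker(host, port, resolver=socket.getaddrinfo):
    """Return the sorted IP addresses `host` resolves to, or [] if it does not resolve.

    `resolver` defaults to socket.getaddrinfo and can be replaced in tests.
    """
    try:
        infos = resolver(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    # Broker host and port as they appear in the error output in this thread.
    addresses = resolve_broker("kafka.kerberos-demo.local", 9093)
    print(addresses or "broker hostname does not resolve on this node")
```

An empty result on the client node would be consistent with the missing /etc/hosts entry described above.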

The error we get:

reason: [{{"kafka.kerberos-demo.local", 9093},
          {{:failed_to_upgrade_to_ssl, :timeout},
           [{:kpro_connection, :maybe_upgrade_to_ssl, 5, [file: 'src/kpro_connection.erl', line: 307]},
            {:kpro_connection, :init_connection, 2, [file: 'src/kpro_connection.erl', line: 227]},
            {:kpro_connection, :init, 4, [file: 'src/kpro_connection.erl', line: 170]},
            {:proc_lib, :init_p_do_apply, 3, [file: 'proc_lib.erl', line: 249]}
           ]
          }
         }
        ]
...
State: {:state,
        :client1,
        [{"kafka.kerberos-demo.local", 9093}],
        :undefined,
        [],
        :undefined,
        :undefined,
        [ssl: [cacertfile: "/tmp/test/priv/ca.crt", certfile: "/tmp/test/priv/client.crt", keyfile: "/tmp/test/priv/client.key", password: 'test1234'],
         sasl: {:callback, :brod_gssapi, {:gssapi, "/tmp/test/priv/rig.key", "rig/[email protected]"}}
        ],
        :client1
       }

On the broker side it fails here: https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/common/requests/RequestHeader.java#L89-L94

Observations:

  • The Kafka setup works with Kafka's console-producer and console-consumer.
  • SASL plain and SASL SSL both work.

Any pointers on how we might solve this?

kevinbader avatar May 08 '20 08:05 kevinbader

It seems to me that by the time kpro_connection tried to perform the TLS handshake, the broker had already started waiting for requests.

Maybe double-check that port 9093 is indeed served by an SSL or SASL_SSL listener, and not by a PLAINTEXT or SASL_PLAINTEXT one.
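
One way to check this without guessing, assuming you can read the broker's listeners and listener.security.protocol.map settings, is to map each port to its security protocol. A rough Python sketch follows; the function name listener_protocols is hypothetical and the sample values are made up.

```python
def listener_protocols(listeners, protocol_map=""):
    """Map each listener port to its security protocol.

    `listeners` is the value of the broker's `listeners` setting, for example
    "SASL_SSL://0.0.0.0:9093,PLAINTEXT://:9092". `protocol_map` is the optional
    `listener.security.protocol.map` value used when listeners have custom names.
    """
    mapping = {}
    for item in protocol_map.split(","):
        item = item.strip()
        if item:
            name, protocol = item.split(":", 1)
            mapping[name.strip()] = protocol.strip()

    result = {}
    for entry in listeners.split(","):
        entry = entry.strip()
        if not entry:
            continue
        name, address = entry.split("://", 1)
        port = int(address.rsplit(":", 1)[1])
        # Without a map entry, the listener name itself is the protocol.
        result[port] = mapping.get(name, name)
    return result


if __name__ == "__main__":
    print(listener_protocols("SASL_SSL://0.0.0.0:9093,PLAINTEXT://:9092"))
```

If port 9093 maps to PLAINTEXT or SASL_PLAINTEXT, a client configured with ssl options would attempt a TLS handshake that the broker never answers.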

PS. Glad to hear that you have the GSSAPI issue resolved.

zmstone avatar May 08 '20 08:05 zmstone

Hi @zmstone, thanks for your reply. We are pretty sure we are using SASL_SSL. This is the Kafka cluster setup (docker-compose) that we are using: https://github.com/Accenture/reactive-interaction-gateway/tree/kafka-sasl-kerberos-authentication/examples/api-gateway/kafka-kerberos-ssl .

We also tried it with a Kafka cluster with SASL_PLAINTEXT configured. That should be possible as well, shouldn't it? With this setup, we see the following error in our brod_gssapi client:

 10:07:02.605 [error] GenServer :client1 terminating
** (stop) [{{"kafka", 9093}, {{:sasl_auth_error, {:error, :einval}}, [{:kpro_sasl, :auth, 7, [file: 'src/kpro_sasl.erl', line: 43]}, {:kpro_connection, :init_connection, 2, [file: 'src/kpro_connection.erl', line: 240]}, {:kpro_connection, :init, 4, [file: 'src/kpro_connection.erl', line: 170]}, {:proc_lib, :init_p_do_apply, 3, [file: 'proc_lib.erl', line: 249]}]}}]
    (brod 3.10.0) /tmp/test/priv/deps/brod/src/brod_client.erl:554: :brod_client.ensure_metadata_connection/1
    (brod 3.10.0) /tmp/test/priv/deps/brod/src/brod_client.erl:300: :brod_client.handle_info/2
    (stdlib 3.12.1) gen_server.erl:637: :gen_server.try_dispatch/4
    (stdlib 3.12.1) gen_server.erl:711: :gen_server.handle_msg/6
    (stdlib 3.12.1) proc_lib.erl:249: :proc_lib.init_p_do_apply/3
Last message: :init
State: {:state, :client1, [{"kafka", 9093}], :undefined, [], :undefined, :undefined, [sasl: {:callback, :brod_gssapi, {:gssapi, "/tmp/test/priv/secret/rig.key", "rig/[email protected]"}}], :client1}

and the following in the Kafka broker logs:

[2020-04-23 16:21:24,524] DEBUG Processor 1 listening to new connection from /172.20.0.1:52672 (kafka.network.Processor)
[2020-04-23 16:21:24,524] DEBUG connections.max.reauth.ms for mechanism=GSSAPI: 0 (org.apache.kafka.common.security.authenticator.SaslServerAuthenticator)
[2020-04-23 16:21:24,524] DEBUG Set SASL server state to HANDSHAKE_OR_VERSIONS_REQUEST during authentication (org.apache.kafka.common.security.authenticator.SaslServerAuthenticator)
[2020-04-23 16:21:24,525] DEBUG Handling Kafka request API_VERSIONS during authentication (org.apache.kafka.common.security.authenticator.SaslServerAuthenticator)
[2020-04-23 16:21:24,525] DEBUG Set SASL server state to HANDSHAKE_REQUEST during authentication (org.apache.kafka.common.security.authenticator.SaslServerAuthenticator)
[2020-04-23 16:21:26,549] DEBUG Failed during authentication: Error parsing request header. Our best guess of the apiKey is: 0 (org.apache.kafka.common.security.authenticator.SaslServerAuthenticator)
[2020-04-23 16:21:26,549] WARN [SocketServer brokerId=0] Unexpected error from /172.20.0.1; closing connection (org.apache.kafka.common.network.Selector)
org.apache.kafka.common.errors.InvalidRequestException: Error parsing request header. Our best guess of the apiKey is: 0
Caused by: java.nio.BufferUnderflowException
	at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:151)
	at java.nio.ByteBuffer.get(ByteBuffer.java:715)
	at org.apache.kafka.common.protocol.ByteBufferAccessor.readArray(ByteBufferAccessor.java:53)
	at org.apache.kafka.common.protocol.Readable.readString(Readable.java:37)
	at org.apache.kafka.common.message.RequestHeaderData.read(RequestHeaderData.java:125)
	at org.apache.kafka.common.message.RequestHeaderData.<init>(RequestHeaderData.java:83)
	at org.apache.kafka.common.requests.RequestHeader.parse(RequestHeader.java:93)
	at org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.handleKafkaRequest(SaslServerAuthenticator.java:497)
	at org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.authenticate(SaslServerAuthenticator.java:281)
	at org.apache.kafka.common.network.KafkaChannel.prepare(KafkaChannel.java:173)
	at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:547)
	at org.apache.kafka.common.network.Selector.poll(Selector.java:483)
	at kafka.network.Processor.poll(SocketServer.scala:890)
	at kafka.network.Processor.run(SocketServer.scala:789)
	at java.lang.Thread.run(Thread.java:748)

For comparison, a successful connection from kafka-console-producer looks like this:

[2020-04-23 16:23:03,071] DEBUG Processor 2 listening to new connection from /172.20.0.5:39272 (kafka.network.Processor)
[2020-04-23 16:23:03,072] DEBUG connections.max.reauth.ms for mechanism=GSSAPI: 0 (org.apache.kafka.common.security.authenticator.SaslServerAuthenticator)
[2020-04-23 16:23:03,318] DEBUG Set SASL server state to HANDSHAKE_OR_VERSIONS_REQUEST during authentication (org.apache.kafka.common.security.authenticator.SaslServerAuthenticator)
[2020-04-23 16:23:03,318] DEBUG Handling Kafka request API_VERSIONS during authentication (org.apache.kafka.common.security.authenticator.SaslServerAuthenticator)
[2020-04-23 16:23:03,319] DEBUG Set SASL server state to HANDSHAKE_REQUEST during authentication (org.apache.kafka.common.security.authenticator.SaslServerAuthenticator)
[2020-04-23 16:23:03,326] DEBUG Handling Kafka request SASL_HANDSHAKE during authentication (org.apache.kafka.common.security.authenticator.SaslServerAuthenticator)
[2020-04-23 16:23:03,326] DEBUG Using SASL mechanism 'GSSAPI' provided by client (org.apache.kafka.common.security.authenticator.SaslServerAuthenticator)
[2020-04-23 16:23:03,327] DEBUG Creating SaslServer for kafka/[email protected] with mechanism GSSAPI (org.apache.kafka.common.security.authenticator.SaslServerAuthenticator)
[2020-04-23 16:23:03,328] DEBUG Set SASL server state to AUTHENTICATE during authentication (org.apache.kafka.common.security.authenticator.SaslServerAuthenticator)
[2020-04-23 16:23:03,403] INFO Successfully authenticated client: [email protected]; [email protected]. (org.apache.kafka.common.security.authenticator.SaslServerCallbackHandler)
[2020-04-23 16:23:03,403] DEBUG Authentication complete; session max lifetime from broker config=0 ms, no credential expiration; no session expiration, sending 0 ms to client (org.apache.kafka.common.security.authenticator.SaslServerAuthenticator)
[2020-04-23 16:23:03,403] DEBUG Set SASL server state to COMPLETE during authentication (org.apache.kafka.common.security.authenticator.SaslServerAuthenticator)
[2020-04-23 16:23:03,404] DEBUG [SocketServer brokerId=0] Successfully authenticated with /172.20.0.5 (org.apache.kafka.common.network.Selector)

Knappek avatar May 09 '20 10:05 Knappek

Hi @Knappek. Judging by the crash log posted by @kevinbader, it is quite clear to me that the crash happened during the TLS handshake, not during authentication (which comes after the TLS handshake). The log you shared, however, does show a failure during authentication, and I have no clue about that one.

zmstone avatar May 09 '20 10:05 zmstone

Thanks a lot for the quick reply. Okay, got it. I will look into the TLS handshake a bit more deeply and verify that the certs are all correct etc. But in general, brod_gssapi should also work with SASL_PLAINTEXT, shouldn't it? If so, I have to look into the authentication process again.

Knappek avatar May 09 '20 10:05 Knappek

Yes, brod_gssapi should be able to work with SASL_PLAINTEXT. It uses the socket provided by brod to perform the auth steps, so it should not need to know whether it is communicating over a plaintext or an encrypted TCP connection.

It seems you are using confluent-platform-2.12 in your tests? https://github.com/Accenture/reactive-interaction-gateway/blob/kafka-sasl-kerberos-authentication/examples/api-gateway/kafka-kerberos-ssl/kafka/Dockerfile Which Kafka version does 2.12 map to? I can't seem to find it here: https://docs.confluent.io/current/installation/versions-interoperability.html#cp-and-apache-ak-compatibility

zmstone avatar May 09 '20 11:05 zmstone

> Yes, brod_gssapi should be able to work with SASL_PLAINTEXT.

That's awesome, and probably easier to test without TLS enabled.

Yeah, the versioning was confusing to me as well. We are using Confluent Platform version 5.4 (see this line); confluent-platform-2.12 means we are installing the Confluent Platform packages built for Scala 2.12.

Since we are using Confluent Platform version 5.4, Kafka 2.4 is installed.

Knappek avatar May 09 '20 16:05 Knappek

I am running into a similar issue while upgrading from brod 3.4.0 to a newer version, brod 3.9.4, together with newer versions of kafka_protocol (2.4.0) and brod_gssapi (master 7447ee4).

In the old version, brod_sock performs the query_api_versions call after sasl_auth completes, so SASL authentication on the socket is done before any other interaction with the server except TLS. https://github.com/klarna/brod/blob/3.4.0/src/brod_sock.erl#L205

In the newer version, kpro_connection makes the query_api_versions call before the sasl_auth interaction (as suggested by the Kafka protocol doc, I believe): https://github.com/klarna/kafka_protocol/blob/master/src/kpro_connection.erl#L240

This changed what the broker expects to receive next. The expected SASL interaction sequence is documented here: https://kafka.apache.org/protocol.html#sasl_handshake

SASL Authentication Sequence

The following sequence is used for SASL authentication:

  1. Kafka ApiVersionsRequest may be sent by the client to obtain the version ranges of requests supported by the broker. This is optional.
  2. Kafka SaslHandshakeRequest containing the SASL mechanism for authentication is sent by the client. If the requested mechanism is not enabled in the server, the server responds with the list of supported mechanisms and closes the client connection. If the mechanism is enabled in the server, the server sends a successful response and continues with SASL authentication.
  3. The actual SASL authentication is now performed. If SaslHandshakeRequest version is v0, a series of SASL client and server tokens corresponding to the mechanism are sent as opaque packets without wrapping the messages with Kafka protocol headers. If SaslHandshakeRequest version is v1, the SaslAuthenticate request/response are used, where the actual SASL tokens are wrapped in the Kafka protocol. The error code in the final message from the broker will indicate if authentication succeeded or failed.
  4. If authentication succeeds, subsequent packets are handled as Kafka API requests. Otherwise, the client connection is closed.

For interoperability with 0.9.0.x clients, the first packet received by the server is handled as a SASL/GSSAPI client token if it is not a valid Kafka request. SASL/GSSAPI authentication is performed starting with this packet, skipping the first two steps above.
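
To make step 2 concrete, here is a rough Python sketch of how a SaslHandshakeRequest could be framed on the wire, assuming the non-flexible request header (api_key, api_version, correlation_id, client_id) and API key 17 from the Kafka protocol guide. It only builds the bytes and is not how kafka_protocol implements it; the function names are made up.

```python
import struct

API_KEY_SASL_HANDSHAKE = 17  # per the Kafka protocol guide


def kafka_string(value):
    """Encode a Kafka STRING / NULLABLE_STRING: int16 length, then UTF-8 bytes (-1 for null)."""
    if value is None:
        return struct.pack(">h", -1)
    raw = value.encode("utf-8")
    return struct.pack(">h", len(raw)) + raw


def encode_sasl_handshake(mechanism, correlation_id, client_id, version=1):
    """Build a size-prefixed SaslHandshakeRequest (v0 or v1) naming the SASL mechanism."""
    header = struct.pack(">hhi", API_KEY_SASL_HANDSHAKE, version, correlation_id)
    header += kafka_string(client_id)
    body = kafka_string(mechanism)
    payload = header + body
    # Every Kafka request is preceded by its length as a big-endian int32.
    return struct.pack(">i", len(payload)) + payload


if __name__ == "__main__":
    frame = encode_sasl_handshake("GSSAPI", correlation_id=1, client_id="client1")
    print(frame.hex())
```

Once the broker has answered an ApiVersionsRequest, it expects a frame like this next. A raw GSSAPI token sent at that point would be parsed as a request header, which fits the "Error parsing request header" failure in the broker log earlier in this thread.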

Since the wire protocol for sasl_auth has apparently changed, these are the options I can see:

  1. Continue to use brod_gssapi (master 7447ee4) as it is: you need to switch off query_api_versions by adding {query_api_versions, false} to the config, or by any equivalent means. This essentially forces sasl_auth to go via the "legacy route": the SASL/GSSAPI client token is sent to the server as the first packet(s), and sasl_auth is performed the old way (without the Kafka API request wrapper).
  2. sasl_handshake v0 route: in this mode you can continue to use brod_gssapi as it is, provided you have completed sasl_handshake v0 before you call brod_gssapi:auth.
  3. sasl_handshake v1 route: in this mode brod_gssapi:auth won't work, because every SASL auth token needs to be wrapped in a Kafka protocol request (sasl_authenticate v0 format, to be exact). You will need to replicate the logic in brod_gssapi with kpro_req_lib:make/3 and then kpro_lib:send_and_recv/5. There is a catch: for some reason, kpro_lib:encode/2 encodes a 0-byte string as <<-1:32/?INT>> rather than [<<0:32/?INT>>, <<>>]. The Kafka broker doesn't like this (at least during the sasl_authenticate call) and rejects sasl_authenticate whenever it receives such a request. I had to override the field to the latter manually. A separate issue has been raised for this. @zmstone, do you know the reason for this 0-byte encoding behavior?
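
The difference between the two encodings in option 3 is easier to see in bytes. The sketch below is plain Python and not the kafka_protocol code; it assumes the Kafka wire convention that a bytes field is an int32 length followed by the data, with a length of -1 meaning null. The function name encode_bytes is made up.

```python
import struct


def encode_bytes(value):
    """Encode a Kafka bytes field: big-endian int32 length, then the data.

    None is encoded as length -1 (null); an empty value as length 0 with no data.
    """
    if value is None:
        return struct.pack(">i", -1)
    return struct.pack(">i", len(value)) + value


if __name__ == "__main__":
    print(encode_bytes(None).hex())  # null: the <<-1:32>> form
    print(encode_bytes(b"").hex())   # empty: the <<0:32>> form followed by no data
```

The two forms are both four bytes long but mean different things to the receiver: one says "no value", the other "a value of length zero".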

shou1dwe avatar Oct 05 '20 02:10 shou1dwe

Hi, we are trying to use brod_gssapi to authenticate with our kerberized cluster. The brokers have SASL_PLAINTEXT listeners with GSSAPI as the enabled mechanism. Our Kafka is a pretty old version, 1.0.1.

We are experiencing the same error described in this issue and tried to follow route 1 from @shou1dwe's last comment, without success.

We saw that you added a brod_gssapi_v1 module, so we tried it, expecting that it was added for compatibility with Kafka's new sasl_handshake mechanism. But it still does not work.

We traced all the calls to sasl_auth and found that the call to sasl_client_start is the one that gives us the infamous -4 (SASL_NOMECH). We also found that we are receiving 'EXTERNAL ANONYMOUS' as the response from sasl_listmech.
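
The two observations fit together: if the mechanism list reported locally does not contain GSSAPI, a client start that asks for GSSAPI has nothing to select. A tiny Python sketch of that check, assuming the space-separated format shown above; the helper name has_mechanism is hypothetical.

```python
def has_mechanism(listmech_output, wanted="GSSAPI"):
    """Return True if `wanted` appears in a whitespace-separated SASL mechanism list."""
    return wanted.upper() in listmech_output.upper().split()


if __name__ == "__main__":
    # The list reported in this comment: GSSAPI is absent.
    print(has_mechanism("EXTERNAL ANONYMOUS"))
```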

Do you believe there is something we are missing? Is the brod_gssapi_v1 module the intended solution?

We are still trying to find a solution and contribute it back.

Thanks.

ramonpin avatar Mar 03 '22 18:03 ramonpin

Have you installed the cyrus-sasl and cyrus-sasl-gssapi packages?

Please note, these packages are not available on Mac. It's too much of a pain to make it work on a MacBook.

You also need to make sure your Active Directory and domains are configured in the krb5.conf file on the machine you are using. Typically the Kafka administrators provide these details.

Thanks,
Vikas

vikas15bhardwaj avatar Mar 03 '22 18:03 vikas15bhardwaj

Thanks, Vikas. We are currently working on Linux boxes, similar to our future runtime environment, and have the libgssapi-krb5-2 library installed. I don't know if there is any incompatibility between that environment and the brod_gssapi plugin.

ramonpin avatar Mar 10 '22 09:03 ramonpin

Sorry, I'm not sure whether libgssapi-krb5-2 includes cyrus-sasl and cyrus-sasl-gssapi, but you do need the Cyrus packages to make it work:

sudo yum -y install cyrus-sasl cyrus-sasl-gssapi

Please note, we have tested this on RHEL 7 and 8. I'm not sure which Linux you are using, but the steps for installing cyrus-sasl and the cyrus-sasl-gssapi plugin may vary depending on the distribution.

vikas15bhardwaj avatar Mar 15 '22 19:03 vikas15bhardwaj

Thanks @vikas15bhardwaj, that was the issue. We are using a Debian variant and did not have libsasl2-modules-gssapi-mit installed. That is the package that allows libsasl2-2 to use GSSAPI as the authentication mechanism for Kerberos.

For anyone having this problem on a Debian variant, you need both packages:

  • libsasl2-2
  • libsasl2-modules-gssapi-mit

There is also libsasl2-modules-gssapi-heimdal, in case you are using a KDC from the Heimdal project instead of the MIT one.
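
To verify that the plugin really landed on disk, one option is to look for the GSSAPI plugin file in the SASL plugin directory. This is a rough Python sketch; the directory list is an assumption and varies by distribution and architecture, and find_gssapi_plugins is a made-up name.

```python
from pathlib import Path

# Assumed plugin directories; adjust for your distribution and architecture.
CANDIDATE_DIRS = (
    "/usr/lib/x86_64-linux-gnu/sasl2",
    "/usr/lib64/sasl2",
    "/usr/lib/sasl2",
)


def find_gssapi_plugins(dirs=CANDIDATE_DIRS):
    """Return paths of files named libgssapiv2* found in the given directories."""
    found = []
    for directory in dirs:
        path = Path(directory)
        if path.is_dir():
            found.extend(sorted(str(p) for p in path.glob("libgssapiv2*")))
    return found


if __name__ == "__main__":
    plugins = find_gssapi_plugins()
    print(plugins or "no GSSAPI plugin found in the assumed directories")
```

An empty result in all of the directories would match the SASL_NOMECH symptom discussed in this thread.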

ramonpin avatar Mar 21 '22 09:03 ramonpin