confluent-kafka-dotnet

Connecting to Azure Event Hubs for Kafka with workload identity from AKS hangs and restarts the pod

Open andrew-at-v opened this issue 1 year ago • 5 comments

Description

When I try to connect from AKS with a managed identity, the connection simply hangs inside OauthHandlerCallback. Compared with my local development environment, the following log line is never emitted, or at least I don't see it in the logs:

[thrd:app]: Waking up waiting broker threads after setting OAUTHBEARER token

How to reproduce

  • Try to connect to Azure Event Hubs from AKS with a user-assigned managed identity, using this configuration:
private const string brokerVersionFallback = "1.0.0";
private const int socketTimeoutMs = 60000; // This corresponds to the Consumer config `request.timeout.ms`
private const int sessionTimeoutMs = 30000;
private const int metadataMaxAgeMs = 180000;

// The enclosing ConsumerConfig initializer is assumed; the original snippet listed only the properties.
// eventHubNamespace, consumerGroupName, autoOffsetReset and autoCommit are defined elsewhere in the application.
var config = new ConsumerConfig
{
    BootstrapServers = eventHubNamespace,
    SecurityProtocol = SecurityProtocol.SaslSsl,
    SaslMechanism = SaslMechanism.OAuthBearer,
    SocketTimeoutMs = socketTimeoutMs,
    SessionTimeoutMs = sessionTimeoutMs,
    GroupId = consumerGroupName,
    AutoOffsetReset = autoOffsetReset,
    BrokerVersionFallback = brokerVersionFallback,
    EnableAutoCommit = autoCommit,
    SocketKeepaliveEnable = true,
    MetadataMaxAgeMs = metadataMaxAgeMs
};
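
The report does not include the token callback itself, so for readers trying to reproduce it, here is a minimal sketch of how an OAUTHBEARER refresh handler is commonly wired up with Azure.Identity. It is not the reporter's code. The class and method names, the `eventHubsHost` parameter, the 30-second timeout, the principal name and the token scope format are all assumptions made for illustration. The sketch bounds the token request with a timeout and reports failures back to the client, so a credential problem should surface as an error rather than a silent hang.

```csharp
using System;
using System.Threading;
using Azure.Core;
using Azure.Identity;
using Confluent.Kafka;

// Hypothetical helper, not part of Confluent.Kafka or of the reporter's application.
public static class EventHubsConsumerFactory
{
    // eventHubsHost is assumed to look like "my-namespace.servicebus.windows.net".
    public static IConsumer<string, string> Build(ConsumerConfig config, string eventHubsHost)
    {
        // DefaultAzureCredential is assumed to resolve the workload identity
        // from the environment that AKS injects into the pod.
        var credential = new DefaultAzureCredential();

        // Assumed scope format for Event Hubs; verify against the Azure documentation.
        var scopes = new[] { $"https://{eventHubsHost}/.default" };

        return new ConsumerBuilder<string, string>(config)
            .SetOAuthBearerTokenRefreshHandler((client, _) =>
            {
                try
                {
                    // Bound the token request so the callback cannot block indefinitely.
                    using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(30));
                    AccessToken token = credential.GetToken(
                        new TokenRequestContext(scopes), cts.Token);

                    // lifetimeMs is the absolute expiry in milliseconds since the epoch.
                    // The principal name is a placeholder chosen for this sketch.
                    client.OAuthBearerSetToken(
                        token.Token,
                        token.ExpiresOn.ToUnixTimeMilliseconds(),
                        "event-hubs-client");
                }
                catch (Exception ex)
                {
                    // Hand the failure to the client instead of swallowing it.
                    client.OAuthBearerSetTokenFailure(ex.ToString());
                }
            })
            .Build();
    }
}
```

If the callback still never returns with a handler shaped like this, that would point at the credential lookup inside the pod rather than at the Kafka client, but that is only a hypothesis until debug logs are available.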

Checklist

Please provide the following information:

  • [ ] A complete (i.e. we can run it), minimal program demonstrating the problem. No need to supply a project file.
  • [x] Confluent.Kafka nuget version. 2.4.0
  • [ ] Apache Kafka version.
  • [ ] Client configuration.
  • [ ] Operating system.
  • [ ] Provide logs (with "debug" : "..." as necessary in configuration).
  • [ ] Provide broker log excerpts.
  • [ ] Critical issue.

andrew-at-v Jul 10 '24 08:07

With a connection string it works: the client can connect. But for some reason it is not producing any messages (and reports no error); the log simply says Received MetadataResponse, Sent MetadataRequest.

andrew-at-v Jul 10 '24 09:07

The not-producing part could be a different issue (not related to Kafka).

andrew-at-v Jul 10 '24 11:07

There are a lot of silent errors with this library (which is very dangerous). The pod is now producing messages to Event Hubs for Kafka using a connection string.

So the issue with workload identity still remains.

andrew-at-v Jul 10 '24 14:07
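
On the "silent errors" point above: with `Produce`, delivery failures are reported asynchronously, so they are easy to miss unless a delivery handler and an error handler are registered. The sketch below shows one way to make them visible. The class and method names and the 30-second flush timeout are hypothetical, and this is not claimed to be the cause of the behaviour described in this thread.

```csharp
using System;
using Confluent.Kafka;

// Hypothetical helper for illustration only.
public static class ProducerDiagnostics
{
    public static void ProduceAndReport(ProducerConfig config, string topic, string value)
    {
        using var producer = new ProducerBuilder<Null, string>(config)
            // Client-level errors (broker down, authentication failures, ...) arrive here.
            .SetErrorHandler((_, e) =>
                Console.Error.WriteLine($"Client error: {e.Reason} (fatal: {e.IsFatal})"))
            .Build();

        // Per-message outcome: without this handler a failed delivery goes unnoticed.
        producer.Produce(topic, new Message<Null, string> { Value = value }, report =>
        {
            if (report.Error.IsError)
            {
                Console.Error.WriteLine($"Delivery failed: {report.Error.Reason}");
            }
            else
            {
                Console.WriteLine($"Delivered to {report.TopicPartitionOffset}");
            }
        });

        // Flush returns the number of messages still queued when the timeout expires.
        int remaining = producer.Flush(TimeSpan.FromSeconds(30));
        if (remaining > 0)
        {
            Console.Error.WriteLine($"{remaining} message(s) were not delivered before the flush timeout.");
        }
    }
}
```

Checking the return value of `Flush` matters here because a producer that is disposed while messages are still queued can drop them without raising an exception.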

Hi @abratv Can you provide debug logs? That might help in finding the issue.

anchitj Jul 16 '24 12:07
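
For anyone collecting the debug logs requested above, the following sketch shows one way to turn them on in the .NET client. The `Debug` contexts chosen here (`security,broker,protocol`) are a guess at what is relevant to an OAUTHBEARER hang, and the helper name is hypothetical; adjust both as needed.

```csharp
using System;
using Confluent.Kafka;

// Hypothetical helper for illustration only.
public static class DebugLogging
{
    public static ConsumerBuilder<string, string> CreateBuilder(ConsumerConfig config)
    {
        // Comma-separated librdkafka debug contexts; "all" is also accepted but very verbose.
        config.Debug = "security,broker,protocol";

        return new ConsumerBuilder<string, string>(config)
            // Route librdkafka log lines to stdout so they show up in the pod logs.
            .SetLogHandler((_, log) =>
                Console.WriteLine($"{DateTime.UtcNow:O} [{log.Level}] {log.Facility}: {log.Message}"))
            .SetErrorHandler((_, e) =>
                Console.Error.WriteLine($"Client error: {e.Reason}"));
    }
}
```

The returned builder can still have a token refresh handler attached before `Build()` is called, so the same logging setup works for both the connection-string and the workload-identity configurations.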

We've been seeing something extremely similar when attempting to publish from a dotnet-isolated runtime Functions App (.NET8) on Confluent.Kafka version 2.5.0.

In our case the function that is attempting to do this basically just crashes with no exception (we have tried very hard to catch any possible exceptions).

We haven't yet tried connection strings rather than OAuth, but will probably give that a go next.

Mark-A-Williams Jul 26 '24 16:07