confluent-kafka-dotnet
Connecting to Azure Event Hubs for Kafka with workload identity from AKS hangs and restarts the pod
Description
When I try to connect from AKS with a managed identity, the connection simply hangs inside OauthHandlerCallback. Compared with my local development environment, the following log line is never written, or at least I don't see it in the log:
[thrd:app]: Waking up waiting broker threads after setting OAUTHBEARER token
How to reproduce
- Try to connect to Azure Event Hubs from AKS with a user-assigned managed identity
private const string brokerVersionFallback = "1.0.0";
private const int socketTimeoutMs = 60000; // Maps to the client config `socket.timeout.ms`
private const int sessionTimeoutMs = 30000;
private const int metadataMaxAgeMs = 180000;

var config = new ConsumerConfig
{
    BootstrapServers = eventHubNamespace,
    SecurityProtocol = SecurityProtocol.SaslSsl,
    SaslMechanism = SaslMechanism.OAuthBearer,
    SocketTimeoutMs = socketTimeoutMs,
    SessionTimeoutMs = sessionTimeoutMs,
    GroupId = consumerGroupName,
    AutoOffsetReset = autoOffsetReset,
    BrokerVersionFallback = brokerVersionFallback,
    EnableAutoCommit = autoCommit,
    SocketKeepaliveEnable = true,
    MetadataMaxAgeMs = metadataMaxAgeMs
};
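The issue does not show the OAuth callback itself, so here is a rough sketch of what a token refresh handler for this configuration might look like. It is an illustration, not the reporter's code. It assumes the Azure.Identity package, that `DefaultAzureCredential` picks up the workload identity settings injected into the pod, and that the token scope is derived from the namespace host. The class name, method name, principal name, and 30-second timeout are all hypothetical. The idea is to bound the token call and to report failures back to the client with `OAuthBearerSetTokenFailure`, so that a failed token request shows up as an error instead of a silent stall.

```csharp
using System;
using System.Threading;
using Azure.Core;
using Azure.Identity;
using Confluent.Kafka;

public static class EventHubsOAuth
{
    // Hypothetical helper. eventHubNamespace is assumed to look like
    // "<name>.servicebus.windows.net:9093".
    public static IConsumer<string, string> BuildConsumer(
        ConsumerConfig config, string eventHubNamespace)
    {
        // Assumption: the Event Hubs scope is "https://<host>/.default".
        var host = eventHubNamespace.Split(':')[0];
        var scope = $"https://{host}/.default";
        var credential = new DefaultAzureCredential();

        return new ConsumerBuilder<string, string>(config)
            .SetOAuthBearerTokenRefreshHandler((client, _) =>
            {
                try
                {
                    // Bound the token request so the callback cannot block forever.
                    using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(30));
                    AccessToken token = credential.GetToken(
                        new TokenRequestContext(new[] { scope }), cts.Token);

                    // lifetimeMs is the absolute expiry in ms since the Unix epoch.
                    client.OAuthBearerSetToken(
                        token.Token,
                        token.ExpiresOn.ToUnixTimeMilliseconds(),
                        "event-hubs-client"); // placeholder principal name
                }
                catch (Exception ex)
                {
                    // Surface the failure to the client instead of swallowing it.
                    client.OAuthBearerSetTokenFailure(ex.ToString());
                }
            })
            .Build();
    }
}
```

If the handler reaches the `catch` branch in AKS but not locally, the exception text should indicate whether the workload identity token exchange is what fails.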
Checklist
Please provide the following information:
- [ ] A complete (i.e. we can run it), minimal program demonstrating the problem. No need to supply a project file.
- [x] Confluent.Kafka nuget version: 2.4.0
- [ ] Apache Kafka version.
- [ ] Client configuration.
- [ ] Operating system.
- [ ] Provide logs (with "debug" : "..." as necessary in configuration).
- [ ] Provide broker log excerpts.
- [ ] Critical issue.
With a connection string it works: the client can connect, but for some reason it does not produce any messages (and reports no error). The log only says Received MetadataResponse, Sent MetadataRequest.
The not-producing part could be a different issue (not related to Kafka).
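One way to check whether messages are silently dropped is to attach a delivery report handler and an error handler to the producer and flush before exiting. The sketch below uses only the public Confluent.Kafka producer API. The class and method names, the `Null` key type, and the 10-second flush timeout are assumptions made for illustration.

```csharp
using System;
using Confluent.Kafka;

public static class ProducerDiagnostics
{
    // Hypothetical helper: sends one message and logs its delivery outcome.
    public static void SendWithReport(ProducerConfig config, string topic, string payload)
    {
        using var producer = new ProducerBuilder<Null, string>(config)
            // Client-level errors (broker down, auth failures) arrive here.
            .SetErrorHandler((_, e) => Console.Error.WriteLine($"Client error: {e.Reason}"))
            .Build();

        // Produce is asynchronous; the callback reports the per-message result.
        producer.Produce(topic, new Message<Null, string> { Value = payload }, report =>
        {
            if (report.Error.IsError)
                Console.Error.WriteLine($"Delivery failed: {report.Error.Reason}");
            else
                Console.WriteLine($"Delivered to {report.TopicPartitionOffset}");
        });

        // Flush returns the number of messages still queued after the timeout.
        int remaining = producer.Flush(TimeSpan.FromSeconds(10));
        if (remaining > 0)
            Console.Error.WriteLine($"{remaining} message(s) still undelivered after flush");
    }
}
```

Without a delivery handler or a flush, a process that exits early can discard queued messages without logging anything, which would look like the behavior described above.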
There are a lot of silent errors with this library (which is very dangerous). The pod is now producing messages to Event Hubs Kafka using a connection string.
So the issue with workload identity still remains.
Hi @abratv, can you provide debug logs? That might help in finding the issue.
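For reference, debug logs can be captured by setting the `Debug` config property and registering a log handler. This is a minimal sketch: the class and method names are hypothetical, and the debug contexts chosen (`security,broker,protocol`) are one plausible selection for an authentication problem, not a required set.

```csharp
using System;
using Confluent.Kafka;

public static class DebugLogging
{
    // Hypothetical helper: builds a consumer with verbose client logging.
    public static IConsumer<string, string> BuildVerboseConsumer(ConsumerConfig config)
    {
        // Comma-separated librdkafka debug contexts; "all" is the most verbose.
        config.Debug = "security,broker,protocol";

        return new ConsumerBuilder<string, string>(config)
            // Route client log lines to stdout so they appear in the pod logs.
            .SetLogHandler((_, log) =>
                Console.WriteLine($"{log.Level} {log.Facility} {log.Name}: {log.Message}"))
            // Report client-level errors separately from debug output.
            .SetErrorHandler((_, error) =>
                Console.Error.WriteLine($"Kafka error: {error.Reason} (fatal: {error.IsFatal})"))
            .Build();
    }
}
```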
We've been seeing something extremely similar when attempting to publish from a dotnet-isolated runtime Functions App (.NET8) on Confluent.Kafka version 2.5.0.
In our case the function that is attempting to do this basically just crashes with no exception (we have tried very hard to catch any possible exceptions).
We haven't yet tried connection strings rather than OAuth, but will probably give that a go next.