strimzi-kafka-operator
Improve monitoring
Currently it is difficult to look inside a JVM running in a pod to understand things like garbage collection. The story is inconsistent between different images:
- For the Kafka, ZooKeeper and Kafka Connect images we provide the JMX exporter, which, if users install Prometheus, would give them visibility of JVM metrics.
- The Kafka, ZooKeeper and Kafka Connect images have GC logging enabled.
- For TC and CC we provide essentially nothing.
From a development perspective it's a lot of faff to have to deploy Prometheus and Grafana just to see a graph of some metrics. It would be more useful to be able to use a tool like visualvm and kubectl port-forward. There are a couple of issues with doing that:
- jstatd, which would give visualvm better insight into the JVM, is part of the JDK, not the JRE, so it's difficult to arrange for that tool to be available in the images.
- It's possible to use visualvm with JMX (though it's less capable than through jstatd) by passing system properties to the java command:

      -Dcom.sun.management.jmxremote.port=9999 \
      -Dcom.sun.management.jmxremote.rmi.port=9999 \
      -Dcom.sun.management.jmxremote.local.only=false \
      -Dcom.sun.management.jmxremote.authenticate=false \
      -Dcom.sun.management.jmxremote.ssl=false \
      -Djava.rmi.server.hostname=127.0.0.1

  but this also requires the port to be EXPOSEd in the Dockerfile, and we probably should make this sufficiently configurable that it supports SSL and authentication.
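As a rough sketch of the JMX approach described above, the system properties can be assembled for a chosen port and then reached through a port-forward. The helper name `jmx_opts`, the variable `JMX_OPTS` and the pod name in the comments are hypothetical. How the resulting string reaches the `java` command depends on the image, and this sketch does not cover that.

```shell
#!/bin/sh
# jmx_opts: hypothetical helper that prints the JMX system properties
# listed in this issue, using the same port for the JMX registry and RMI.
# Authentication and SSL are disabled, so this is for development only.
jmx_opts() {
  port="${1:-9999}"
  printf '%s %s %s %s %s %s\n' \
    "-Dcom.sun.management.jmxremote.port=${port}" \
    "-Dcom.sun.management.jmxremote.rmi.port=${port}" \
    "-Dcom.sun.management.jmxremote.local.only=false" \
    "-Dcom.sun.management.jmxremote.authenticate=false" \
    "-Dcom.sun.management.jmxremote.ssl=false" \
    "-Djava.rmi.server.hostname=127.0.0.1"
}

# JMX_OPTS is an illustrative variable name, not one the images are known to read.
JMX_PORT=9999
JMX_OPTS="$(jmx_opts "$JMX_PORT")"
echo "$JMX_OPTS"

# From a workstation, forward the same port number (hypothetical pod name):
#   kubectl port-forward pod/my-cluster-kafka-0 9999:9999
# and then add localhost:9999 as a JMX connection in visualvm.
```

The local and remote port numbers are kept identical because the RMI stub advertises 127.0.0.1 and the configured RMI port, so the client connects back to that same port on localhost.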
Ideally whatever we do will be configured consistently across all the images.
Triaged 22.2.2022: This can be configured using environment variables, and the EXPOSE is not needed in the Dockerfile. All that is needed here is to describe it in the (developer) documentation.
This would be a really useful feature. Does it trigger a rolling update? In that case, it couldn't be used with hard-to-reproduce issues.
Yes, it requires a rolling update if you enable it.
@scholzj you wrote: "This can be configured using environment variables".
I assume you meant the environment variable KAFKA_JMX_OPTS in this case. Could you please elaborate on the correct way to set it?
++ please elaborate
++ here too.