strimzi-kafka-operator Improve monitoring

Currently it is difficult to look inside a JVM running in a pod to understand things like garbage collection. The story is inconsistent between different images:

For the Kafka, ZooKeeper and Kafka Connect images we provide the JMX exporter, which gives users visibility of JVM metrics if they install Prometheus.
The Kafka, ZooKeeper and Kafka Connect images have GC logging enabled.
For TC and CC we provide essentially nothing.

From a development perspective it's a lot of faff to have to deploy prometheus and grafana just to see a graph of some metrics. It would be more useful to be able to use a tool like visualvm and kubectl port-forward. There are a couple of issues with doing that:

jstatd, which would give visualvm better insight into the JVM, is part of the JDK, not the JRE, so it's difficult to arrange for that tool to be available in the images.

It's possible to use visualvm with JMX (though it's less capable than through jstatd) by passing system properties to the java command:

  -Dcom.sun.management.jmxremote.port=9999 \
  -Dcom.sun.management.jmxremote.rmi.port=9999 \
  -Dcom.sun.management.jmxremote.local.only=false \
  -Dcom.sun.management.jmxremote.authenticate=false \
  -Dcom.sun.management.jmxremote.ssl=false \
  -Djava.rmi.server.hostname=127.0.0.1

but this also requires the port to be EXPOSEd in the Dockerfile, and we probably should make this sufficiently configurable that it supports SSL and authentication.
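As a rough illustration of what the JMX route gives you without visualvm, here is a minimal sketch of a client that reads garbage collection counters over JMX. It assumes the JVM was started with the properties above and that port 9999 has been made reachable on 127.0.0.1 (for example with kubectl port-forward). The class name GcProbe and its helper methods are hypothetical and not part of Strimzi.

```java
import java.io.IOException;
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;
import javax.management.MBeanServerConnection;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Hypothetical helper, not part of Strimzi: reads GC counters over JMX.
class GcProbe {

    // Builds the standard RMI-based JMX service URL for a host and port.
    static String serviceUrl(String host, int port) {
        return "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi";
    }

    // Returns one line per garbage collector visible through the connection.
    static List<String> gcSummary(MBeanServerConnection conn) throws IOException {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc
                : ManagementFactory.getPlatformMXBeans(conn, GarbageCollectorMXBean.class)) {
            lines.add(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) throws Exception {
        // Assumption: port 9999 of the pod is forwarded to 127.0.0.1:9999.
        JMXServiceURL url = new JMXServiceURL(serviceUrl("127.0.0.1", 9999));
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            gcSummary(connector.getMBeanServerConnection()).forEach(System.out::println);
        }
    }
}
```

Because authentication and SSL are disabled by the properties above, a client like this should only be pointed at a locally forwarded port.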

Ideally whatever we do will be configured consistently across all the images.

May 01 '18 09:05 tombentley

Triaged 22.2.2022: This can be configured using environment variables, and the EXPOSE is not needed in the Dockerfile. All that is needed here is to describe it in the (developer) documentation.

Feb 22 '22 15:02 scholzj

This would be a really useful feature. Does it trigger a rolling update? In that case, it couldn't be used with hard-to-reproduce issues.

Feb 22 '22 15:02 fvaleri

Yes, it requires a rolling update if you enable it.

Feb 22 '22 16:02 scholzj

@scholzj you wrote:

This can be configured using environment variables

I assume you meant the environment variable KAFKA_JMX_OPTS in this case. Could you please elaborate on the correct way to set it?

Oct 13 '22 01:10 alonpr

++ please elaborate

Jan 13 '23 22:01 aidan-melen

++ here too.

Jun 15 '23 14:06 f3r73ch
