Weird memory usage compared to JDBC and BigQuery Kafka connectors
Describe the bug
Hey team, we are using the ClickHouse, JDBC, and BigQuery Kafka Connect sink connectors with identical settings and resources:
- 4GB RAM limit and requests in k8s
- 2CPU limit and requests
- KAFKA_HEAP_OPTS: -XX:+UseContainerSupport -XX:InitialRAMPercentage=60 -XX:MaxRAMPercentage=80
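To illustrate what these flags should produce, here is a rough local sanity check (a sketch only; the azul/zulu-openjdk:11 image and Docker limits are stand-ins for our Connect image and the k8s requests/limits above):

```
# Approximate the pod's limits locally to sanity-check the heap sizing.
# With a 4 GiB memory limit, MaxRAMPercentage=80 should report ~3.2G max heap.
docker run --rm --memory=4g --cpus=2 azul/zulu-openjdk:11 \
  java -XX:+UseContainerSupport \
       -XX:InitialRAMPercentage=60 -XX:MaxRAMPercentage=80 \
       -XshowSettings:VM -version
```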
I found that only ClickHouse shows unusual memory usage on the max:kubernetes.memory.usage_pct metric, growing until it hits 100%. You can see it here: red is the ClickHouse KC (90%-100%), blue is the BigQuery KC (80%).
The memory pool metric also looks unstable compared to the BigQuery KC (violet - CH, blue - BQ):
This command returns the same settings on both containers:
java -XX:+UseContainerSupport -XX:InitialRAMPercentage=60 -XX:MaxRAMPercentage=80 -XshowSettings:VM -version
VM settings:
Max. Heap Size (Estimated): 3.20G
Using VM: OpenJDK 64-Bit Server VM
Property settings:
awt.toolkit = sun.awt.X11.XToolkit
file.encoding = UTF-8
file.separator = /
java.awt.graphicsenv = sun.awt.X11GraphicsEnvironment
java.awt.printerjob = sun.print.PSPrinterJob
java.class.path =
java.class.version = 55.0
java.home = /usr/lib/jvm/zulu11-ca
java.io.tmpdir = /tmp
java.library.path = /usr/java/packages/lib
/usr/lib64
/lib64
/lib
/usr/lib
java.runtime.name = OpenJDK Runtime Environment
java.runtime.version = 11.0.18+10-LTS
java.specification.name = Java Platform API Specification
java.specification.vendor = Oracle Corporation
java.specification.version = 11
java.vendor = Azul Systems, Inc.
java.vendor.url = http://www.azul.com/
java.vendor.url.bug = http://www.azul.com/support/
java.vendor.version = Zulu11.62+17-CA
java.version = 11.0.18
java.version.date = 2023-01-17
java.vm.compressedOopsMode = Zero based
java.vm.info = mixed mode
java.vm.name = OpenJDK 64-Bit Server VM
java.vm.specification.name = Java Virtual Machine Specification
java.vm.specification.vendor = Oracle Corporation
java.vm.specification.version = 11
java.vm.vendor = Azul Systems, Inc.
java.vm.version = 11.0.18+10-LTS
jdk.debug = release
jdk.vendor.version = Zulu11.62+17-CA
line.separator = \n
os.arch = amd64
os.name = Linux
os.version = 5.15.107+
path.separator = :
sun.arch.data.model = 64
sun.boot.library.path = /usr/lib/jvm/zulu11-ca/lib
sun.cpu.endian = little
sun.cpu.isalist =
sun.io.unicode.encoding = UnicodeLittle
sun.java.launcher = SUN_STANDARD
sun.jnu.encoding = UTF-8
sun.management.compiler = HotSpot 64-Bit Tiered Compilers
sun.os.patch.level = unknown
user.dir = /home/appuser
user.home = /home/appuser
user.language = en
user.name = appuser
user.timezone =
Operating System Metrics:
Provider: cgroupv2
Effective CPU Count: 2
CPU Period: 100000us
CPU Quota: 200000us
CPU Shares: 2048us
List of Processors: N/A
List of Effective Processors, 8 total:
0 1 2 3 4 5 6 7
List of Memory Nodes: N/A
List of Available Memory Nodes, 1 total:
0
Memory Limit: 4.00G
Memory Soft Limit: 0.00K
Memory & Swap Limit: 4.00G
Maximum Processes Limit: 629145
openjdk version "11.0.18" 2023-01-17 LTS
OpenJDK Runtime Environment Zulu11.62+17-CA (build 11.0.18+10-LTS)
OpenJDK 64-Bit Server VM Zulu11.62+17-CA (build 11.0.18+10-LTS, mixed mode)
Expected behaviour
ClickHouse Kafka Connect should respect MaxRAMPercentage and stop memory usage from growing beyond 80%. Does ClickHouse Kafka Connect override any memory settings, or ship custom ones, that could explain this behavior?
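One way to check whether anything overrides these settings is to inspect the flags the running JVM actually picked up (a sketch; it assumes jcmd is available in the container and that the Connect JVM is PID 1 - adjust the PID if it differs):

```
# Dump the flags in effect for the running Kafka Connect JVM.
jcmd 1 VM.flags

# Alternatively, print the final values of the RAM-percentage flags
# for the same java binary under the same container limits.
java -XX:+UseContainerSupport -XX:MaxRAMPercentage=80 \
     -XX:+PrintFlagsFinal -version | grep -i ramperc
```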
Configuration
Environment
- Kafka-Connect version: 0.0.14
- OS: linux
@nicolae-gorgias I don't believe we have specific memory settings, but one thing I would suggest is updating the version - we've just released 1.0.8 and (especially compared to 0.0.14) there have been a lot of dependencies updated since then 😄
Echoing what @Paultagoras said: the latest version uses the latest JDBC driver, v0.5.0, which has been refactored to reduce the memory footprint. @nicolae-gorgias would you mind upgrading and sharing your feedback? If the problem still exists, we would happily investigate.
@Paultagoras @mshustov thanks, I'll do it and come back with feedback!
@mshustov @Paultagoras Interesting - the vertical line marks the upgrade moment. It changed the memory pool metric, but total memory usage is still the same. What other metrics can I check to understand which components consume it?
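One option for breaking the usage down by component is the JVM's Native Memory Tracking plus a heap histogram (a sketch; it assumes you can restart the worker with NMT enabled, e.g. by adding -XX:NativeMemoryTracking=summary via KAFKA_OPTS, and that the Connect JVM is PID 1):

```
# Heap vs. metaspace, threads, code cache, GC and internal buffers.
jcmd 1 VM.native_memory summary

# Current heap occupancy and region layout.
jcmd 1 GC.heap_info

# Top object types on the heap (:live forces a full GC first).
jmap -histo:live 1 | head -n 30
```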
Hmm interesting - I'll have to take another look. We don't control/configure memory usage to my knowledge; that's all handled by Kafka Connect, but maybe there's some flag or something that we need to provide somewhere.
Is total memory still hitting 100%?