confluent-kafka-python
Number of file handles does not scale with number of Consumer(s)
Description
The number of open file handles does not grow linearly with the number of Consumers in a process. I have a simple example that creates N Consumers in a loop. For each N in [1...10], I run `lsof | awk '{print $4}' | sort | uniq -c | sort -n` to examine the number of open file handles. Below are the counts for each N in [1...10]:
| N (consumers) | Open file handles |
|---|---|
| 1 | 531 |
| 2 | 1386 |
| 3 | 1848 |
| 4 | 4068 |
| 5 | 5896 |
| 6 | 8046 |
| 7 | 10518 |
| 8 | 13320 |
| 9 | 16443 |
| 10 | 19890 |
Furthermore, why does a single Consumer need more than 500 open file handles?
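For a quick in-process check of the same trend, descriptors can be counted directly as consumers are created. This is a minimal sketch, assuming a Linux host where `/proc/self/fd` is available; `open_fd_count` is a hypothetical helper, not part of the client:

```python
import os
import uuid

from confluent_kafka import Consumer


def open_fd_count() -> int:
    # Each entry in /proc/self/fd is one open descriptor (Linux-only).
    return len(os.listdir("/proc/self/fd"))


conf = {
    "bootstrap.servers": "INSERT_ANY_BOOTSTRAP",
    "enable.auto.commit": False,
    "group.id": "kafka-test" + str(uuid.uuid1()),
}

consumers = []
baseline = open_fd_count()
for n in range(1, 11):
    consumer = Consumer(conf)
    consumer.subscribe([str(uuid.uuid1())])  # any dummy topic name
    consumers.append(consumer)
    # If fd usage were linear, the per-iteration delta would be constant;
    # the counts above suggest it grows with n instead.
    print(f"{n} consumer(s): {open_fd_count() - baseline} descriptors above baseline")
```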
How to reproduce
```python
import time
import uuid

from confluent_kafka import Consumer

if __name__ == "__main__":
    conf = {
        "bootstrap.servers": "INSERT_ANY_BOOTSTRAP",
        "enable.auto.commit": False,
        "group.id": "kafka-test" + str(uuid.uuid1()),
    }

    # Keep references to the consumers so they are not garbage collected.
    tracker = []
    for i in range(10):
        topic = str(uuid.uuid1())  # can be any dummy topic name
        consumer = Consumer(**conf)
        consumer.subscribe([topic])
        tracker.append(consumer)

    # Keep the process alive so open file handles can be inspected.
    while True:
        time.sleep(5)
```
With the script running, inspect the open file handles with:

```
lsof | awk '{print $4}' | sort | uniq -c | sort -n
```
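Note that the pipeline above counts descriptors across every process on the host. To scope the measurement to the reproduction script alone, something like the following can be used (a sketch assuming `lsof` is installed and on PATH; `lsof_fd_count` is a hypothetical helper, not part of the client):

```python
import os
import subprocess


def lsof_fd_count(pid=None):
    # Count open descriptors for one process; `lsof -p` limits output to that PID.
    pid = os.getpid() if pid is None else pid
    out = subprocess.run(
        ["lsof", "-p", str(pid)],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    # Drop the header line; each remaining line is one open descriptor.
    return max(len(out.splitlines()) - 1, 0)


if __name__ == "__main__":
    print(lsof_fd_count())
```

Running this against the PID of the reproduction script after each consumer is created gives a per-process view of the same growth.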
Checklist
Please provide the following information:
- [x] confluent-kafka-python and librdkafka version (`confluent_kafka.version()` and `confluent_kafka.libversion()`): 1.9.0 and 1.9.0
- [ ] Apache Kafka broker version:
- [ ] Client configuration: `{...}`
- [ ] Operating system:
- [ ] Provide client logs (with `'debug': '..'` as necessary)
- [ ] Provide broker log excerpts
- [ ] Critical issue