confluent-kafka-python
                                
Why does confluent-kafka not send more than 5k messages with a 1 MB payload?
Description
I am running some tests. First I sent 5 messages with a 1 MB payload, then 50, then 500 with the same payload, and everything worked well. But when I send 5k messages, it throws the following errors:
`%5|1617340039.942|REQTMOUT|rdkafka#producer-1| [thrd:sasl_plaintext://52.149.147.197:32266/2]: sasl_plaintext://52.149.147.197:32266/2: Timed out ProduceRequest in flight (after 950ms, timeout #0)
%4|1617340039.942|REQTMOUT|rdkafka#producer-1| [thrd:sasl_plaintext://52.149.147.197:32266/2]: sasl_plaintext://52.149.147.197:32266/2: Timed out 1 in-flight, 0 retry-queued, 1 out-queue, 1 partially-sent requests
%3|1617340039.942|FAIL|rdkafka#producer-1| [thrd:sasl_plaintext://52.149.147.197:32266/2]: sasl_plaintext://52.149.147.197:32266/2: 2 request(s) timed out: disconnect (after 302251ms in state UP)
%5|1617340040.949|REQTMOUT|rdkafka#producer-1| [thrd:sasl_plaintext://52.149.147.197:32266/2]: sasl_plaintext://52.149.147.197:32266/2: Timed out ProduceRequest in flight (after 979ms, timeout #0)
%4|1617340040.949|REQTMOUT|rdkafka#producer-1| [thrd:sasl_plaintext://52.149.147.197:32266/2]: sasl_plaintext://52.149.147.197:32266/2: Timed out 1 in-flight, 0 retry-queued, 1 out-queue, 1 partially-sent requests
%3|1617340040.949|FAIL|rdkafka#producer-1| [thrd:sasl_plaintext://52.149.147.197:32266/2]: sasl_plaintext://52.149.147.197:32266/2: 2 request(s) timed out: disconnect (after 1001ms in state UP, 1 identical error(s) suppressed)
Processed 5000 messsages in 303.98 seconds
15.69 MB/s
16.45 Msgs/s
%4|1617340041.663|TERMINATE|rdkafka#producer-1| [thrd:app]: Producer terminating with 897 messages (1100181264 bytes) still in queue or transit: use flush() to wait for outstanding message delivery`
My configs:
`p = Producer({'bootstrap.servers': '10.x.x.x:19092',
    'security.protocol': 'SASL_PLAINTEXT', 'sasl.mechanisms': 'PLAIN',
    'sasl.username': kafka_user, 'sasl.password': kafka_password,
    'compression.codec': 'snappy', 'message.max.bytes': '1000000000',
    'queue.buffering.max.messages': '10000000', 'queue.buffering.max.kbytes': '2147483647',
    'queue.buffering.max.ms': '500'})`
I have read that the maximum message payload can be 1 MB. How should I proceed for larger payloads? Is there anything I am missing?
How to reproduce
Checklist
Please provide the following information:
- [ ] confluent-kafka-python and librdkafka version (confluent_kafka.version() and confluent_kafka.libversion()): latest
- [ ] Apache Kafka broker version:
- [ ] Client configuration: {...}
- [ ] Operating system: ubuntu 18.02
- [ ] Provide client logs (with 'debug': '..' as necessary)
- [ ] Provide broker log excerpts
- [ ] Critical issue
It seems odd that the ProduceRequests are timing out after only one second. Are you sure that you are not setting request.timeout.ms or message.timeout.ms?
@edenhill No, I am not using request.timeout.ms or message.timeout.ms in my configs.
Ah, I think I see what is going on. Your produce rate is too high for the network/cluster, so messages queue up in the client. By the time they are eventually transmitted, their remaining timeout may be so short that the message times out while in flight to the broker. Your run lasts for 303 seconds and the default message.timeout.ms is 300s, so that sort of makes sense.
If you reduce the producer queue size you will get quicker back pressure (produce() will raise QUEUE_FULL) and you can stop producing until there is room in the queue.
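A minimal sketch of that back-pressure loop, for illustration only. It assumes the Python client, where a full local queue surfaces from `produce()` as a `BufferError`. The helper name `produce_with_backpressure` and the `poll_timeout` parameter are made up for this example and are not part of the library.

```python
def produce_with_backpressure(producer, topic, payloads, poll_timeout=1.0):
    """Produce every payload, pausing whenever the local producer queue is full.

    `producer` is expected to behave like confluent_kafka.Producer
    (produce/poll/flush). This helper is hypothetical, not a library API.
    Returns (number of times the queue was found full, messages left after flush).
    """
    queue_full_count = 0
    for payload in payloads:
        while True:
            try:
                producer.produce(topic, value=payload)
                break
            except BufferError:
                # Local queue is full: serve delivery reports so that
                # delivered messages free up room, then retry this payload.
                queue_full_count += 1
                producer.poll(poll_timeout)
        # Serve any pending delivery callbacks without blocking.
        producer.poll(0)
    # Wait for outstanding messages instead of terminating with a full queue.
    remaining = producer.flush()
    return queue_full_count, remaining
```

With a real client this would be called as `produce_with_backpressure(p, 'my-topic', images)`, where `p` is the `Producer` from the configuration above and the topic name and `images` iterable are placeholders.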
@edenhill Okay, should I try with linger_ms=5?
No, rather limit queue.buffering.max.kbytes and queue.buffering.max.messages to only allow for say 60 seconds worth of messages. e.g., if your input rate is 1000 messages per second, set queue.buffering.max.messages to 60000.
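The sizing rule above can be sketched as a small calculation. This is only an illustration: `queue_limits` is a hypothetical helper, the 60 second window comes from the comment above, and it assumes one kbyte means 1024 bytes for `queue.buffering.max.kbytes`.

```python
import math


def queue_limits(msgs_per_sec, avg_msg_bytes, window_sec=60):
    """Return queue settings that hold roughly `window_sec` seconds of messages.

    Hypothetical helper for illustration; the rate and average message size
    must be measured for your own workload.
    """
    max_messages = math.ceil(msgs_per_sec * window_sec)
    # Assumption: queue.buffering.max.kbytes counts units of 1024 bytes.
    max_kbytes = math.ceil(max_messages * avg_msg_bytes / 1024)
    return {
        "queue.buffering.max.messages": max_messages,
        "queue.buffering.max.kbytes": max_kbytes,
    }


# The example from the comment above: 1000 messages per second,
# here with an assumed average size of 1 KiB per message.
limits = queue_limits(1000, 1024)
```

For 1 MB payloads the byte limit matters far more than the message count, so both values should be set together and merged into the producer configuration.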
@edenhill I will try and update.
@edenhill So the main thing is that I am using image bytes as the payload.
Just wondering whether this issue was resolved. I am facing something similar and wanted to ask if you ever found a resolution. Thanks!
Having the same problem. Was this solved?
See my previous comments on setting queue sizes.
I have the same issue. How can I solve this?
The main question is already answered by @edenhill. Closing this issue.