pulsar-client-go
DisableBatching still goes through the batching logic, and AUTO_CONSUME cannot consume with the correct schema type
Actual behavior
When using the Go client to produce messages with an Avro schema, and consuming with
pulsar-client consume -st auto_consume -s test topic_name -n 0
I get messages with a string schema:
----- got message -----
key:[Ml8=], properties:[], content:{value=, type=class java.lang.String}
----- got message -----
key:[MjNf], properties:[], content:{value=..@, type=class java.lang.String}
This happens with or without DisableBatching enabled.
Expected behavior
This is inconsistent with the Python client.
With the Python client, when batching is enabled, I get the string-typed messages shown above.
When batching is disabled, I get Avro-typed messages.
I'm not sure what batching has to do with how AUTO_CONSUME infers the schema type. I expect to always get Avro-typed messages, whether batching is enabled or not.
Steps to reproduce
- Run pulsar-client consume -st auto_consume -s test topic_name -n 0
- Use the Go client to produce Avro messages with DisableBatching=true or DisableBatching=false
It looks like even if DisableBatching is set to true, the producer still uses a batchBuilder to build a batch, and that batchBuilder respects all the other batching-related configs.
https://github.com/apache/pulsar-client-go/blob/efb102522f7c8b25d13c282512fa5a5fe2f2ae44/pulsar/producer_partition.go#L246
Possibly related issue: https://github.com/apache/pulsar/issues/11288
System configuration
Pulsar version: 2.9.0
@longtengz Messages are flushed by the BatchBuilder through internalFlushCurrentBatch(). Even if batching is disabled, the message still needs to be added to a batchBuilder. When batching is disabled, this is the code that flushes individual messages: https://github.com/apache/pulsar-client-go/blob/efb102522f7c8b25d13c282512fa5a5fe2f2ae44/pulsar/producer_partition.go#L500
So the current implementation is correct to create a default batch builder when batching is disabled.
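To illustrate the flow described above, here is a minimal, self-contained sketch. It is not the actual pulsar-client-go code: the types batchBuilder and producer, the helper sendAll, and the maxMessages limit are all hypothetical stand-ins. It only shows the general idea that every message goes through a batch builder, and that with batching disabled the builder is flushed right after each message is added.

```go
package main

import "fmt"

// batchBuilder is a hypothetical stand-in for the client's batch builder.
// It accumulates messages until it holds maxMessages.
type batchBuilder struct {
	maxMessages int
	pending     []string
}

// add appends msg and reports false if the builder is already full.
func (b *batchBuilder) add(msg string) bool {
	if len(b.pending) >= b.maxMessages {
		return false
	}
	b.pending = append(b.pending, msg)
	return true
}

// flush returns the accumulated messages and resets the builder.
func (b *batchBuilder) flush() []string {
	out := b.pending
	b.pending = nil
	return out
}

// producer is a hypothetical stand-in for a partition producer.
// sent records each flushed batch in the order it was flushed.
type producer struct {
	disableBatching bool
	builder         *batchBuilder
	sent            [][]string
}

// flushCurrentBatch moves whatever the builder holds into sent.
func (p *producer) flushCurrentBatch() {
	if batch := p.builder.flush(); len(batch) > 0 {
		p.sent = append(p.sent, batch)
	}
}

// send always adds the message to the builder. With batching disabled,
// the builder is flushed immediately, so each flush carries one message.
func (p *producer) send(msg string) {
	if !p.builder.add(msg) {
		// Builder is full: flush it, then retry the add.
		p.flushCurrentBatch()
		p.builder.add(msg)
	}
	if p.disableBatching {
		p.flushCurrentBatch()
	}
}

// sendAll sends msgs through a fresh producer and returns the flushed batches.
func sendAll(disableBatching bool, maxMessages int, msgs ...string) [][]string {
	p := &producer{
		disableBatching: disableBatching,
		builder:         &batchBuilder{maxMessages: maxMessages},
	}
	for _, m := range msgs {
		p.send(m)
	}
	p.flushCurrentBatch() // final flush for anything still pending
	return p.sent
}

func main() {
	fmt.Println(sendAll(true, 3, "a", "b", "c"))       // [[a] [b] [c]]
	fmt.Println(sendAll(false, 3, "a", "b", "c", "d")) // [[a b c] [d]]
}
```

In this sketch the only difference between the two modes is when the flush happens. Whether a single flushed message is then written in the batch wire format or the single-message format is a separate question, and the sketch does not model it.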
That seems very confusing to me. If batching is disabled, why use a batchBuilder? From the comments in the code, I assume the batch message format differs from the single-message format sent to the broker. If batching is disabled, it shouldn't use the batch format to send a single message, right? Also, as I mentioned, the other batching-related configs are still used in non-batching mode.
Nevertheless, auto_consume doesn't work like the Python client in non-batching mode.