
DisableBatching still goes through the batching logic, and AUTO_CONSUME cannot consume with the correct schema type

Open longtengz opened this issue 3 years ago • 2 comments

Actual behavior

When I produce messages with an Avro schema using the Go client and consume them with

pulsar-client consume -st auto_consume -s test topic_name -n 0

I get messages with a string schema:

----- got message -----                  
key:[Ml8=], properties:[], content:{value=, type=class java.lang.String}
----- got message ----- 
key:[MjNf], properties:[], content:{value=..@, type=class java.lang.String}

This happens whether DisableBatching is set to true or false.

Expected behavior

This is inconsistent with the Python client.

With the Python client and batching enabled, I get the string-typed messages shown above.

With batching disabled, I get Avro-typed messages.

I'm not sure what batching has to do with how AUTO_CONSUME infers the schema type. I expect to always get Avro-typed messages, whether batching is enabled or not.

Steps to reproduce

  1. pulsar-client consume -st auto_consume -s test topic_name -n 0
  2. Use the Go client to produce Avro messages with DisableBatching=true or DisableBatching=false

It looks like even if DisableBatching is set to true, the producer still uses a batchBuilder to build a batch, and that batchBuilder respects all the other batching-related configs.

https://github.com/apache/pulsar-client-go/blob/efb102522f7c8b25d13c282512fa5a5fe2f2ae44/pulsar/producer_partition.go#L246

Possibly related issue: https://github.com/apache/pulsar/issues/11288

System configuration

Pulsar version: 2.9.0

longtengz avatar Dec 08 '21 18:12 longtengz

@longtengz Messages are flushed by the BatchBuilder through internalFlushCurrentBatch(). Even if batching is disabled, the message still needs to be added to a batchBuilder. When batching is disabled, the code that flushes individual messages is at https://github.com/apache/pulsar-client-go/blob/efb102522f7c8b25d13c282512fa5a5fe2f2ae44/pulsar/producer_partition.go#L500

So the current implementation is correct in creating a default batch builder even when batching is disabled.

zzzming avatar Dec 10 '21 21:12 zzzming

That seems very confusing to me. If batching is disabled, why use a batchBuilder at all? From the comments in the code, I assume there is a difference between the format of batched messages and that of a single message sent to the broker. If batching is disabled, the client shouldn't use the batch format to send a single message, right? Also, as I mentioned, the other batching-related configs are still applied in non-batching mode.

Nevertheless, auto_consume doesn't behave the way it does with the Python client in non-batching mode.

longtengz avatar Dec 11 '21 03:12 longtengz