kafka-go
kafka-go copied to clipboard
Is it possible to manually perform read by a specified offset?
It seems to me that the current implementation lacks the ability to manually perform read request by a specified offset, like
ctx := context.Background()
conn := ...
topic := "my-topic"
partition := 0
offset := 42
msg, err := conn.ReadByOffset(ctx, topic, partition, offset)
Now I have to use kafka.Reader with the following code:
reader := kafka.NewReader(...)
reader.SetOffset(42)
msg := reader.FetchMessage(context.Background())
which requires two rounds of communication (one for updating offsets and one for fetching message) instead of one, as would be possible with the ReadByOffset capability.
Hi @KokorinIlya, it looks to me like SetOffset does not make a remote API call and only makes changes in-memory. Could you clarify what you mean by "two rounds of communication"?
@nlsun, consider two lines from the SetOffset implementation
https://github.com/segmentio/kafka-go/blob/174188e6872cba89043c70b2d54e9b913c3ffacc/reader.go#L1032
https://github.com/segmentio/kafka-go/blob/174188e6872cba89043c70b2d54e9b913c3ffacc/reader.go#L1035
I believe, that these function calls lead to network I/O. Consider, for example, activateReadLag call:
This functions starts a goroutine https://github.com/segmentio/kafka-go/blob/174188e6872cba89043c70b2d54e9b913c3ffacc/reader.go#L1135 that calls another function ReadLag https://github.com/segmentio/kafka-go/blob/174188e6872cba89043c70b2d54e9b913c3ffacc/reader.go#L1146 that definitely involves some network calls https://github.com/segmentio/kafka-go/blob/174188e6872cba89043c70b2d54e9b913c3ffacc/reader.go#L935
@KokorinIlya Oh, I was assuming you were talking about SetOffset being blocked on a network call.
Yes, r.start is starting a reader which makes network calls and r.activateReadLag is gathering lag metrics which makes network calls but both are in goroutines and SetOffset does not wait for them nor does it depend on them.
Were you looking to improve the performance of SetOffset? Or something else?
Hi @KokorinIlya, I think Kafka doesn't offer a search like reading messages by specific offsets. Kafka is message streaming system that works on file systems. To give a better performance on disks, it writes and reads messages in a sequence one by one. Well let me know if Im wrong.