protocol icon indicating copy to clipboard operation
protocol copied to clipboard

[WIP] add LZ4 kafka support

Open dsebban opened this issue 7 years ago • 9 comments

I would like to add LZ4 support as our production topics uses this compression.

The tests don't pass.

I am pretty sure using LZ4FrameInputStream and LZ4FrameOutputStream is the issue here the lz4 java library does not supply a standard InputStream / OutputStream

This https://github.com/boy0001/FastAsyncWorldedit/blob/master/core/src/main/java/net/jpountz/lz4/LZ4OutputStream.java

looks more like it.

Not sure how to approach this, what do you think @pchlupacek ?

dsebban avatar Aug 06 '18 07:08 dsebban

@dsebban thanks for this.

Regarding the tests, I am not sure how the LZ4 implementation work, but we have to somehow be able to take a byte of compressed data and uncompress it which is essentially ByteVector => Attempt[ByteVector] function. As such it has nothing to do whether this is implemented as InputStream/OutputStream algorithm or not.

One think we have to look carefully though is that stream won't be performing any thread blocking operations, which in case of OS/IS means usually no Input/Output at all.

From the signature of LZ4OutputStream I think that shall be doable, given supplied OutputStream won't be blocking, such as ByteBuffer one.

Given your changes I think they are generally ok, so I suspect the LZ4 implementation needs some tweaks to one used in kafka :-(. You have to see any play with it a bit, I am not expert over LZ4.

pchlupacek avatar Aug 06 '18 08:08 pchlupacek

@dsebban also maybe it should be a worth to look at LZ4Compressor, perhaps that would give you chance to implement that vie standard fs2 Pull/pipe instead of relying on IS/OS types.

pchlupacek avatar Aug 06 '18 09:08 pchlupacek

cool , thanks for the pointers , I will play a bit and try to work out a non blocking implementation of LZ4

dsebban avatar Aug 07 '18 07:08 dsebban

@pchlupacek I ended up importing KafkaLZ4BlockOutputStream.java and KafkaLZ4BlockInputStream.java only in a java src folder from the kafka client, they tweaked it heavily, so reimplementing it seems like a lot of work.

Also v0 messages does not work, let me know how I should proceed and if this approach is not acceptable.

dsebban avatar Aug 14 '18 12:08 dsebban

@dsebban cool. thanks for your work.

I need to check on licence things for the files you have imported. They are apache, and fs2-kafka is MIT so I hope they can be mixed.

I will review this in few days, as we expect release fs2-kafka with 1.0-M4.

pchlupacek avatar Aug 22 '18 04:08 pchlupacek

Any news on this?

cleverdeveloper avatar Jan 21 '19 17:01 cleverdeveloper

@pchlupacek ?

dsebban avatar Jan 27 '19 21:01 dsebban

@dsebban , @cleverdeveloper sorry for being lte on this. Somehow this was missed under my radar. I will take a look on this this week if thats ok ?

pchlupacek avatar Jan 28 '19 05:01 pchlupacek

I did a temporary release of Kafka part of this repository meanwhile with LZ4 stuff included: https://github.com/evolution-gaming/kafka-protocol

cleverdeveloper avatar Feb 11 '19 11:02 cleverdeveloper