confluent-kafka-python

Python implementation of Kafka Streams?

Open dalejin2014 opened this issue 8 years ago • 43 comments

We are interested in using Kafka Streams. Is it on the roadmap for the confluent-kafka-python library?

dalejin2014 avatar Aug 29 '16 18:08 dalejin2014

@dalejin2014 We'd love to have native stream processing libraries in different languages and having really good Kafka clients is the basis for that. That said, we don't have a timeline for adding this yet.

ewencp avatar Aug 30 '16 03:08 ewencp

@dalejin2014: As @ewencp mentioned we don't have a timeline yet. The reason for this is that we first want to ensure we have a strong foundation in the form of the Java implementation of Kafka Streams before venturing into non-JVM languages.

That said, of course I took a note of your request. :-)

Do you mind sharing some information about your use case where you'd use Kafka Streams from Python?

miguno avatar Aug 30 '16 10:08 miguno

We are interested in developing a commenting feature similar to the one in Google Docs. The use case is as follows:

  • users in a thread should be notified of events at a frequency of their choosing (realtime, hourly, daily, etc.)

So we are thinking about using Kafka Streams, since it provides:

  • windowing
  • group-by
  • accumulation
  • etc
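As a rough illustration of what windowing, group-by and accumulation mean for the notification use case, here is a minimal plain-Python sketch. It does not use Kafka or any Kafka Streams API; the event shape `(timestamp_s, user, message)` and the function names are assumptions made up for this example.

```python
from collections import defaultdict


def window_start(timestamp_s: int, window_size_s: int) -> int:
    """Align a timestamp (in seconds) to the start of its tumbling window."""
    return timestamp_s - (timestamp_s % window_size_s)


def digest_by_window(events, window_size_s):
    """Group events by (user, window start) and accumulate their messages.

    `events` is an iterable of (timestamp_s, user, message) tuples.
    This tuple layout is hypothetical, chosen only for the example.
    """
    digests = defaultdict(list)
    for timestamp_s, user, message in events:
        # Group-by key: the user plus the window the event falls into.
        key = (user, window_start(timestamp_s, window_size_s))
        # Accumulation: collect every message for that key.
        digests[key].append(message)
    return dict(digests)


events = [
    (0, "alice", "Bob commented"),
    (1800, "alice", "Carol replied"),
    (3700, "alice", "Bob resolved the thread"),
    (100, "dave", "Alice mentioned you"),
]

# Hourly digests: one list of messages per user per one-hour window.
hourly = digest_by_window(events, window_size_s=3600)
```

A real stream processor would additionally have to handle late events, persistent state and repartitioning, which is the part a library like Kafka Streams provides.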

Is there an easy way to port the features from Java client?

dalejin2014 avatar Aug 30 '16 20:08 dalejin2014

Thanks for sharing the background info @dalejin2014.

> Is there an easy way to port the features from Java client?

It's not super-hard but also not trivial. Also, one would need to continuously maintain any such Kafka Streams libraries for other languages with the same commitment and high quality as the current Kafka Streams library for Java, so "porting" is not a one-off effort but an ongoing time investment. Hence our current decision to focus our efforts first on the Java implementation of Kafka Streams.

miguno avatar Aug 31 '16 09:08 miguno

+100 :)

jacqvdm avatar Oct 11 '16 11:10 jacqvdm

Kafka Streams for Python would be so amazing. I'm currently evaluating stream processing frameworks and I like what I've been reading about Kafka Streams. My use case is essentially this: I'm laying down the infrastructure to enable realtime analytics and processing of log/event data. The primary users of this data are data scientists who would be standing up their own Kafka Streams apps, mostly for doing transformations, joins, partitioning and windowed analytics. I think Kafka Streams fits this use case nicely, since the Streams library eliminates a lot of the boilerplate code involved in configuring Kafka consumers and producers but leaves developers the freedom and flexibility to do lots of cool stuff with the data in each Kafka topic. The only catch is that not many of the data scientists are well versed in Java; our language of choice is Python for almost everything. As much as I like Kafka and as excited as I am about Kafka Streams, getting the data scientists on board with writing Java will be an uphill battle.
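To make the boilerplate concrete, here is a minimal sketch of the consume-transform-produce loop one writes by hand with the plain confluent-kafka-python client. The topic names, broker address, group id and the `transform` rule are assumptions for illustration only, and the sketch covers a stateless filter, not joins or windowing.

```python
import json


def transform(event: dict):
    """Hypothetical transformation: keep only error-level log events."""
    if event.get("level") != "error":
        return None
    return {"service": event.get("service"), "message": event.get("message")}


def process_batch(consumer, producer, output_topic, max_messages):
    """Poll up to max_messages, transform each one, and forward the results."""
    handled = 0
    while handled < max_messages:
        msg = consumer.poll(1.0)  # wait up to one second for a message
        if msg is None:
            break  # nothing arrived within the timeout
        if msg.error():
            continue  # skip error events in this sketch
        result = transform(json.loads(msg.value()))
        if result is not None:
            producer.produce(
                output_topic,
                key=msg.key(),
                value=json.dumps(result).encode("utf-8"),
            )
        handled += 1
    producer.flush()  # block until queued messages are delivered
    return handled


def main():
    # Imported here so the functions above stay usable without the package.
    from confluent_kafka import Consumer, Producer

    consumer = Consumer({
        "bootstrap.servers": "localhost:9092",  # assumed broker address
        "group.id": "log-filter",               # assumed consumer group
        "auto.offset.reset": "earliest",
    })
    producer = Producer({"bootstrap.servers": "localhost:9092"})
    consumer.subscribe(["raw-logs"])            # assumed input topic
    try:
        process_batch(consumer, producer, "error-logs", max_messages=1000)
    finally:
        consumer.close()
```

Calling `main()` requires the confluent-kafka package and a reachable broker. Passing the consumer and producer in as arguments keeps the loop testable with stand-in objects.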

With that said, have there been any developments with regard to supporting a Python-based Kafka Streams library?

zzbennett avatar Dec 03 '16 01:12 zzbennett

@zzbennett I hear you, Elizabeth. :-)

Unfortunately our short-term roadmap does not include work on a Python library of Kafka Streams. (We'd definitely welcome contributors though!) Same situation for e.g. kafka-python, a community project.

I'm kinda hesitant to suggest this, but perhaps it would be worth a try to experiment with Jython? IIRC some Ruby users have been experimenting with Kafka Streams' Java library via JRuby. FWIW, there are a few community/external projects already working on various "wrappers" (in a broad sense) for Kafka's Streams and Connect APIs, but they haven't been released yet; I don't remember off the top of my head whether a Python-based one was among them.

miguno avatar Dec 05 '16 11:12 miguno

Thanks for your reply @miguno and thanks for the suggestions. Jython might be a good option for prototyping. I may actually be able to drum up support for Scala based Streams apps, which would work a bit better with the Java libraries.

As far as contributing goes, I may even end up putting together a Python port of Kafka Streams for our use cases. Eventually, with the help of some collaborators in the Kafka Python community, we'd hopefully be able to contribute something upstream. But I suppose we can cross that bridge when we get there. At any rate, thanks again for the help!

zzbennett avatar Dec 06 '16 01:12 zzbennett

@zzbennett Somebody in my group was talking about working on this also. If you create a repo with issues laying out the work and then solicit help, you may find yourself with some contributors reasonably soon.

murphyke avatar Dec 13 '16 22:12 murphyke

@murphyke that would be super. I actually just created a repo last weekend to start working on it (https://github.com/python-kafka-streams/python-kafka-streams). I haven't committed any work or created any tickets yet, but hopefully I'll get a chance to do that in the next couple of days. Feel free to send people over there if they are itching to work on it. Once a little momentum gets built up I'll post to some user groups to solicit help.

zzbennett avatar Dec 13 '16 22:12 zzbennett

@zzbennett I'd love to contribute to the python-kafka-streams repo.

supertramp01 avatar Dec 26 '16 03:12 supertramp01

I would love to work on this, as well as love the idea itself :)

Wondering if someone has some initial design which I can start working with?

ayanguha avatar Mar 22 '17 05:03 ayanguha

so... what's best practice? use Jython?

ghost avatar Jul 08 '17 10:07 ghost

Jython is one option, yes. And some users are actually running Jython-based Kafka Streams applications in production.

Also: There's an upcoming, community-driven Python implementation of Kafka Streams (a first MVP = not all features are already implemented) that will be presented at EuroPython later this month.

miguno avatar Jul 10 '17 07:07 miguno

The code @miguno is referring to is now on GitHub: https://github.com/wintoncode/winton-kafka-streams

Check it out and get involved with the project!

llawall avatar Jul 12 '17 17:07 llawall

no updates for a month on winton, I hope they continue their good project

ghost avatar Aug 14 '17 20:08 ghost

seems dead unfortunately

ghost avatar Sep 16 '17 05:09 ghost

Would be great to have a bit of help from Confluent on this, given Python is the most wanted language in 2017 according to Stack Overflow.

ghost avatar Oct 10 '17 01:10 ghost

@pouledodue: I'd suggest bringing this up at https://github.com/wintoncode/winton-kafka-streams -- the last commit in that project was actually 5 days ago.

miguno avatar Oct 10 '17 12:10 miguno

+1 on this. Question for the community about renaming the project to a more "standard name": https://github.com/wintoncode/winton-kafka-streams/issues/8

rdehouss avatar Oct 29 '17 00:10 rdehouss

At this point I decided to learn the Java ecosystem instead of using a half-baked Python solution.

ghost avatar Feb 21 '18 20:02 ghost

Are there any developments on this request? I was so excited about Kafka, but with no Streams API implementation in Python I am unsure now.

g-rd avatar Jun 22 '18 20:06 g-rd

@g-rd, as of today we are still tracking interest but it doesn't currently have a place on the roadmap.

rnpridgeon avatar Jun 22 '18 22:06 rnpridgeon

@g-rd you may look into Apache Pulsar

ghost avatar Jun 23 '18 17:06 ghost

@g-rd Check https://github.com/wintoncode/winton-kafka-streams

edenhill avatar Jun 24 '18 07:06 edenhill

@edenhill I have looked at it already, but it looks to me like this project is either perfect, with no development needed, or just not being developed. I'm going with not being actively developed. I am now looking at Apache Pulsar, and I think Pulsar is a better fit for me.

g-rd avatar Jun 24 '18 12:06 g-rd

Check out a Kafka Streams inspired Python Stream Processing library we just open sourced: https://robinhood.engineering/faust-stream-processing-for-python-a66d3a51212d

vineetgoel avatar Jul 31 '18 23:07 vineetgoel

It's been over a year. Any further comment on whether Kafka Streams will be available?

bretlowery avatar Aug 28 '19 15:08 bretlowery

We do not have any immediate plans to create a non-Java Kafka Streams implementation. Either look into using KSQL or https://github.com/wintoncode/winton-kafka-streams

edenhill avatar Aug 30 '19 11:08 edenhill

There is Faust, a Python library developed by Robinhood that focuses on event processing and stream processing from a source such as a Kafka topic: https://github.com/robinhood/faust

ZisisFl avatar Jan 15 '20 12:01 ZisisFl