node-rdkafka

Looking for more Collaborators!

Open webmakersteve opened this issue 6 years ago • 45 comments

Hi everybody.

First off, I just want to thank the community for all of the support this repository has gotten. A few years ago when I started working on what would become node-rdkafka, I never could have imagined it getting this big! It wouldn't be nearly as big without all of you - your contributions, your bug reports, and the inspiration to keep me working on it.

This project originated on Blizzard's data team. We built out Kafka support in Node.js because we were using Node for message streaming from HTTP to Kafka to be processed by our downstream services. A few months after that, we stopped using Node for new projects, so we started using the library less. I still supported it as best I could, and we still had existing services that used it that were made better with every release!

About 6 months ago I moved to another team that doesn't use any Node.js, and the amount of time I've had to commit to this repository has gone to its lowest ever. This would be more OK, but my free weekend time has also been at an all-time low.

So, this has all been a long way of saying I need help! I'd like to give some of the responsibility for the management of this project to some of you. If you are interested at all please email me at [email protected]!

I'm hoping to get more active again as best I can to give this project the support it deserves. There's a lot of work to be done on it to make it even better! If that excites you, send me an email!

Thank you everyone for your continued patience. It's totally undeserved, but appreciated :)

webmakersteve avatar Jun 04 '19 02:06 webmakersteve

I think I speak on behalf of a lot of people to say: 🎉thank you for all your effort so far and in the future❤️! So many people, including myself personally and my team professionally, have benefited from all of your hard work, getting this set up in the first place.

I really hope someone with experience in both C++ and Node is willing to take on a contributor role, as especially for high-throughput cases it's hard to beat binding with the excellent librdkafka. Are there any companies / teams out there (apart from Blizzard) with these skillsets that use node-rdkafka to run any production services?

Unfortunately, that excludes myself, as all my C++ experience is just reading and hacking some other people's code. We're only a 2-person indie team, so we lack the resources to address the issues the project has been facing as of late. It pushed us towards adopting a client written 100% in JS for all new stream processing tasks, but without any of the luxuries of librdkafka's throughput, streams, near-mirrored consumer APIs, etc.

JaapRood avatar Jun 10 '19 10:06 JaapRood

What kind of help do you need? Do you have a roadmap of things to improve, modernize? Do you know of bottlenecks that need to be optimized? Would you be open to dropping support for unsupported node releases (<8.0)? Would you be open to modernizing tooling (TypeScript, eslint, prettier commit hooks)?

rusty0412 avatar Jul 10 '19 05:07 rusty0412

@rusty0412

There isn't specifically a road-map. So far there has been very little process involved in what we are upgrading or doing at any given time. Previously it was driven by what I was having to work with at the time (since I was actually using the library), but nowadays I've done very little to actually push the library forward and rely on just merging PRs that come in and providing feedback.

Here is what I'd like to see done in the future:

  1. Drop all support for < 8
  2. Add promise compatible APIs for async functions, or just an easy Promisify wrapper available to wrap all the method calls so they can be used with async await
  3. Upgrade NAN and get rid of all of the deprecations we have. This is a pretty big effort as the async worker handling has been hugely changed.
  4. A lot of C++ refactoring - the code is very old and looks it. There is a lot of stuff that can be consolidated to reduce the huge amount of sprawl.
  5. Get rid of the bundled librdkafka as part of the base package in favor of having a dependency on it being provided as a shared library in the box. I can be convinced that this is a bad idea, but so many of the problems with this library deal with the fact that it is built from source at install time.
  6. First-class typescript support
  7. A lot of CI improvements. Maybe migrate to circle CI? But mainly I want the builder to publish the package on every master build with pre-release tags and auto publish tagged builds.
  8. Better testing. Need to make a mocked version of the thin librdkafka classes and inject those around.
  9. Continue implementing new features of librdkafka. There are ways we can do FD polling for more efficient consumption in streams, etc.
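
As a rough illustration of item 2, a Promise wrapper over callback-style methods could look like the sketch below. `CallbackClient`, `TopicMetadata`, `toPromise`, and `connectAndDescribe` are hypothetical names invented for this example and are not node-rdkafka's actual API; the only assumption is that the wrapped methods take a Node-style `(err, result)` callback as their last argument.

```typescript
// Hypothetical stand-in for a callback-style client. These names are
// illustrative only and are NOT the real node-rdkafka API.
type NodeCallback<T> = (err: Error | null, result?: T) => void;

interface TopicMetadata {
  topic: string;
  partitions: number;
}

class CallbackClient {
  private connected = false;

  connect(cb: NodeCallback<string>): void {
    // Simulate asynchronous work before invoking the callback.
    setTimeout(() => {
      this.connected = true;
      cb(null, "connected");
    }, 0);
  }

  getMetadata(topic: string, cb: NodeCallback<TopicMetadata>): void {
    setTimeout(() => {
      if (!this.connected) {
        cb(new Error("not connected"));
        return;
      }
      cb(null, { topic, partitions: 3 });
    }, 0);
  }
}

// Generic adapter: run a callback-style call and expose it as a Promise.
function toPromise<T>(call: (cb: NodeCallback<T>) => void): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    call((err, result) => {
      if (err) {
        reject(err);
      } else {
        resolve(result as T);
      }
    });
  });
}

// With the adapter in place, callers can use async/await.
async function connectAndDescribe(
  client: CallbackClient,
  topic: string
): Promise<TopicMetadata> {
  await toPromise<string>((cb) => client.connect(cb));
  return toPromise<TopicMetadata>((cb) => client.getMetadata(topic, cb));
}
```

A wrapper like this leaves the existing callback API untouched, so it could ship alongside it rather than replace it.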

If this sounds interesting to you, let me know! But I can also announce that I've added @iradul, @ankon , and @codeburke as collaborators, so thank you guys for volunteering! I appreciate all the help.

webmakersteve avatar Jul 23 '19 21:07 webmakersteve

Looking forward to collaborating on this project @webmakersteve, @iradul, @ankon!

I'm not going to be much help in the C/C++ world, but willing to help how I can. I can definitely start chipping away on some modernization efforts like dropping <8 support, async/await/Promise support, and improvements to CI. Seems like we should open an Issue for each of these to start discussion about what we are planning on changing and let people provide feedback/PRs.

codeburke avatar Jul 23 '19 23:07 codeburke

But I can also announce that I've added @iradul, @ankon , and @codeburke as collaborators, so thank you guys for volunteering! I appreciate all the help.

Thank you for adding us to the team!

  1. First-class typescript support

Should we consider rewriting the whole project in TypeScript? I think that would bring huge benefits.

Also, it might be helpful if we start using the Projects feature to manage and track progress of all the things you'd like to see done in the future.

iradul avatar Jul 25 '19 18:07 iradul

@webmakersteve I think the first thing needed is a more direct line of communication; getting replies is very sporadic and puts the package at high risk. For example, right now it's just broken in 2.71, but there's no quick way to reach you. Since there is no one else, or very, very few, who can merge and kick off new releases, teams around the world that rely on this package must wait for a long period of time. We need to communicate. Let's solve this first.

How about a Gitter channel? I've created this: https://gitter.im/node-rdkafka/community Let's join.

IdanAdar avatar Jul 25 '19 19:07 IdanAdar

Good to meet you Ivan!

Love the idea of using Project to help coordinate bigger changes.

I’d love to hear the advantages of entirely Typescript. Personally I’m not sold on “strong typing” on the internals of a project, but would love to look into how to use Typescript more consistently on all the public interfaces. Working on interface definitions and getting those linked to documentation seems like a big win for the users. I know that I’ve had to sift through the source code to understand some apis recently.

Ryan

codeburke avatar Jul 25 '19 23:07 codeburke

Hi, I am interested in contributing to a TypeScript migration. This project has no need for any dynamic typing behavior and the library needs to be robust. Static type checking will drive robustness since it is a type of test. It will help contributors understand the code layout. TypeScript will allow the project to use more modern syntax than allowed by supported runtimes. It will also guarantee high quality, error-free typing declarations to library users. If it's not TypeScript native, the declarations and API will drift and become inconsistent.

SoyYoRafa avatar Jul 26 '19 03:07 SoyYoRafa

Forgive my naivety on TypeScript for libraries. Will that mean that library users will all need to transpile? I can go along with moving to TypeScript for the reasons you mentioned, as long as we support native ES6 syntax in the released assets. I don’t think that we want to require the library users to set up TypeScript transpiling of their projects just to use node-rdkafka.

codeburke avatar Jul 26 '19 03:07 codeburke

Will that mean that library users will all need to transpile?

No. node-rdkafka's build script would invoke the TypeScript compiler, which would generate JavaScript files and type declarations. node-rdkafka would publish both. Library users just run the JavaScript files; they are unaware the project is written in TypeScript. TypeScript users also run the JavaScript files but use the generated type declaration files in their code.

Additionally, you can use syntax newer than es6 in TypeScript and configure the compiler to generate es6-compatible javascript. When each version of node gets deprecated, you just bump up the target in the TypeScript compiler config and you get the native code for that target.
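
To make that concrete, here is a minimal sketch of what a TypeScript source file in such a setup might look like. `SketchProducer`, `ProducerOptions`, and `DeliveryReport` are hypothetical names made up for illustration, not node-rdkafka's real API. The sketch assumes the build enables the compiler's `declaration` option, so that `.d.ts` files are emitted next to the JavaScript, and sets `target` to whatever the oldest supported Node version can run.

```typescript
// Hypothetical public surface for illustration only; these names are
// NOT node-rdkafka's real API.
export interface ProducerOptions {
  clientId: string;
  brokers: string[];
}

export interface DeliveryReport {
  topic: string;
  partition: number;
  offset: number;
  size: number;
}

export class SketchProducer {
  private nextOffset = 0;

  constructor(private readonly options: ProducerOptions) {}

  get clientId(): string {
    return this.options.clientId;
  }

  // async/await and `??` are newer syntax that the compiler can
  // down-level to whatever `target` is configured.
  async send(
    topic: string,
    value: string,
    partition?: number
  ): Promise<DeliveryReport> {
    const report: DeliveryReport = {
      topic,
      partition: partition ?? 0,
      offset: this.nextOffset,
      size: value.length,
    };
    this.nextOffset += 1;
    return report;
  }
}
```

Plain JavaScript users would only ever see the compiled output of a file like this, while TypeScript users would additionally get the exported interfaces through the generated declarations.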

SoyYoRafa avatar Jul 26 '19 04:07 SoyYoRafa

From my experience contributing to node-rdkafka, I'd have to underline this mention of @webmakersteve:

  1. Upgrade NAN and get rid of all of the deprecations we have. This is a pretty big effort as the async worker handling has been hugely changed.
  2. A lot of C++ refactoring - the code is very old and looks it. There is a lot of stuff that can be consolidated to reduce the huge amount of sprawl.
  3. Get rid of the bundled librdkafka as part of the base package in favor of having a dependency on it being provided as a shared library in the box. I can be convinced that this is a bad idea, but so many of the problems with this library deal with the fact that it is built from source at install time.

Those are definitely the points causing a bunch of problems at the moment, the ones that have been causing stability issues thus far and threaten to do so even more in the future. The binding to librdkafka is what sets node-rdkafka apart with a unique value proposition: piggybacking on a very highly optimized native library that achieves an order of magnitude higher throughput than the official Kafka client. That's the bit worth trying to save, imo.

It's not that the JavaScript side couldn't do with some love, it definitely could. But if what you're looking for is a modern JS code base, TypeScript support, Promise support, strong test suite and CI, I wonder if your time is not better spent on the newer KafkaJS project, which already has all of those things and is at a high rate of further development and progress. It's what we as a tiny 2-person indie team had to move to, and since we don't have much C++/C expertise, its 100% Node codebase makes our contributions a whole bunch easier and more productive.

That's not to say that I think node-rdkafka shouldn't be kept alive / revived, not at all! Given its bindings to librdkafka I could see it as a simpler client that's useful if you need some really serious throughput; it could offer levels of performance hard to match with a fully JavaScript-implemented client. But then the focus should at least be on that part, that unique strength. Otherwise, all we're doing is fragmenting the already small Node + Kafka ecosystem even further and just duplicating efforts :(

JaapRood avatar Jul 26 '19 08:07 JaapRood

Speaking for myself, the reason I focused on the JS portion was because that is where my skillset primarily resides and where I feel I could provide immediate progress. But I do 100% agree that the differentiator here is the use of librdkafka as the performant underlying implementation. As I mentioned, I'm more than happy to help where I can in the C++ layer, but would be good if someone with much more experience could help lead that design and guidance on implementation. Is that you Jaap?

I'm curious about people's thoughts on moving toward N-API instead of NAN. From what I understand, that is the C++-binding layer for Node 8+.

Personally, I'm excited to get started working with y'all to expand my knowledge in other areas and push this project forward.

Ryan

codeburke avatar Jul 26 '19 14:07 codeburke

@codeburke unfortunately I can't be, through lack of experience with that side of the code base as well:

Unfortunately, that excludes myself, as all my C++ experience is just reading and hacking some other people's code. We're only a 2 person indie team, so lacking the resources to address the issues the project has been facing as of late.

JaapRood avatar Jul 26 '19 15:07 JaapRood

We can expect that users' runtimes will support N-API by January 2020 when Node 8 is deprecated. N-API guarantees API stability and ABI compatibility across major versions of node. So, moving to N-API has the big benefit of being able to compile node-rdkafka in node 10 and run it in future major versions of node (12, 14, etc) without recompiling the shared objects. As a result, I don't think upgrading NAN is worthwhile. Separately, we should use node-addon-api, not N-API directly. N-API is a low level C interface. node-addon-api is the C++, object oriented wrapper on N-API. So, around November/December would be a good time to start working on migrating to node-addon-api.

SoyYoRafa avatar Jul 27 '19 14:07 SoyYoRafa

Thumbs up for N-API. Looks like it does support Node 8 according to this: https://nodejs.org/docs/latest/api/n-api.html#n_api_n_api_version_matrix Also, since librdkafka is a C library and N-API is C API it would make sense to use C API instead of N-API C++ wrapper.

iradul avatar Jul 28 '19 00:07 iradul

@iradul node-rdkafka uses the C++ API of librdkafka. NAN->node-addon-api is a simple mechanical change. NAN->N-API is a rewrite. N-API is marked experimental in Node 8. I am not sure if it's safe to use before Node 8.11.2, when N-APIv3 was backported.

SoyYoRafa avatar Jul 28 '19 03:07 SoyYoRafa

Mechanical change or not, it's still a rewrite unless it can be done in some safe and automated way. I'm sure it would be somewhat easier to rewrite using the N-API C++ wrapper, but in the long run the C API seems the better choice because librdkafka is a C library and the librdkafka C++ wrapper always lags behind it.

iradul avatar Jul 28 '19 06:07 iradul

Regardless of label, what I wanted to communicate was that NAN->node-addon-api is pretty simple. I have already done it. I did it manually but there is a conversion tool that automates a portion of the conversion, https://github.com/nodejs/node-addon-api/tree/master/tools.

Re: using the librdkafka C interface, we would have to build an object oriented wrapper on the librdkafka C API either in C++ or JavaScript/TS. Do you think it's more likely that this project will be ahead of librdkafka's own C++ wrapper or that this project lags the librdkafka C++ wrapper? Also, why not just contribute to librdkafka C++ wrapper?

SoyYoRafa avatar Jul 28 '19 15:07 SoyYoRafa

May I suggest opening an issue and moving the discussion to there? In this issue we need to find out who wants to collaborate and become an active, or at least more-available maintainer. Without this this package is going nowhere.

I am less proficient than you guys, but I can be highly available and utilize your expertise with the more complicated PRs. Anyone else willing to chime in?

@webmakersteve It's also up to you to add more maintainers...

IdanAdar avatar Jul 28 '19 17:07 IdanAdar

+1

I think we should move all these major initiatives (TypeScript, Node 8+ only, NAN->N-API, C++ refactor, etc.) to Issues and/or a Project. That way we get maximum engagement and develop with library consumers' input. Ideally we could have one collaborator/maintainer heading up each initiative just to keep it on track and accountable.

Ryan

codeburke avatar Jul 28 '19 17:07 codeburke

Do you think it's more likely that this project will be ahead of librdkafka's own C++ wrapper or that this project lags the librdkafka C++ wrapper? Also, why not just contribute to librdkafka C++ wrapper?

Yes, for example it happened with headers support. C library got it on Jan 4, 2018 and C++ on Nov 26, 2018.

iradul avatar Jul 29 '19 02:07 iradul

I'd be happy to lend support where I can. Internally our team uses this, and I've got a good bit of experience in C/C++, although not so much in NAN, and my primary development language at the moment is Node.js. It would be good to get some people on board for the Node.js side of things, and I could assist on the C++ side as much as I can.

davidtrihy avatar Aug 07 '19 12:08 davidtrihy

Hey y'all,

Just wanted to resurrect this thread and see if there was any movement or plan (coordinated or otherwise)?

@Stephen Parente [email protected] I see there are a few outstanding PRs. In particular there is one that is blocking node 12 testing and adoption that I'd personally love to see get merged and released for my own projects. Any chance you (or we) can get some of those PRs merged and 2.7.2 released?

Ryan

codeburke avatar Sep 12 '19 13:09 codeburke

Just wanted to resurrect this thread and see if there was any movement or plan (coordinated or otherwise)?

It seems publishing to NPM is still a manual task, so I don't know if merging existing PRs would be helpful to anyone. I guess we're still waiting for @webmakersteve to coordinate us, collaborators.

iradul avatar Sep 12 '19 21:09 iradul

Looks like Ivan has some collaborator permissions now! Glad to see some progress in getting some more collaborators. Not sure who else has access, but either way we should start formalizing the list of things to start working on.

Ryan

codeburke avatar Oct 17 '19 19:10 codeburke

@Iradul Does this mean you also received permissions to publish a new release to npm?

noderat avatar Oct 17 '19 21:10 noderat

No, there was something promising in .travis.yml but it turns out it's not working (#700)

iradul avatar Oct 18 '19 20:10 iradul

I really hope we can have maintainers with access to publish npm packages instead of having to create forks.

@iradul, have you been in contact with @webmakersteve recently? If there has been no response for a while maybe we should consider a fork to back as a community before everyone starts to fork and publish their own versions.

uatuko avatar Oct 23 '19 17:10 uatuko

I support a fork at this stage. 2019 has been a lost year for this repository.

SoyYoRafa avatar Oct 24 '19 05:10 SoyYoRafa

Our company is also currently pondering what to do with our Kafka library usage, basically whether there is a fork the community will back or to migrate to the new KafkaJS project and accept lower performance. For this project it seems best to try and create a mainstream fork at this point.

Tapppi avatar Oct 28 '19 10:10 Tapppi