
Should we move this library back to Apache Arrow governance?

Open andygrove opened this issue 3 years ago • 6 comments

datafusion-python was donated to the Apache Arrow project in April 2021 and was added to the arrow-datafusion repository [1].

datafusion-python was removed from the repository in January 2022 [2] and added to a new repository in the datafusion-contrib organization.

I would like to propose bringing the Python bindings back under Apache governance. This will require going through the IP clearance process again, unfortunately.

I propose that we move the code to its own repository, perhaps apache/arrow-datafusion-python?

Let's use this issue and the mailing list to discuss this.

[1] https://github.com/apache/arrow-datafusion/pull/69

[2] https://github.com/apache/arrow-datafusion/pull/1518

andygrove, Jul 15 '22 22:07

I am curious about the rationale for bringing it back under Apache governance (I am not opposed, but I wonder what the benefits are). Is it related to finding more help to maintain the code? Or is it cumbersome to keep up with nontrivial changes in DataFusion?

alamb, Jul 18 '22 10:07

One reason is that I would like to help maintain the package. The proposal is not to bring it back into arrow-datafusion but into its own repo, arrow-datafusion-python. I don't foresee the move having any impact on the DataFusion maintainers' workload.

andygrove, Jul 20 '22 10:07

> One reason is that I would like to help maintain the package.

I see -- as I recall, your employment situation requires ASF-governed projects, correct?

BTW the move makes sense to me

alamb, Jul 20 '22 19:07

@andygrove My only concern is that if, in the future, you were unable to contribute as much to the Python bindings, the maintenance burden would fall to other Arrow contributors who may have less motivation to maintain them, which could slow down progress.

matthewmturner, Jul 21 '22 14:07

> I see -- as I recall your employment situation requires ASF governed projects, correct?

It's not quite that simple. There is a process to go through before committing to an open source project, and that has already been done for ASF projects, so the path is much simpler. In other cases, where the governance of the project is less clear, I have had to add company copyrights to the code being submitted.

andygrove, Jul 22 '22 09:07

> @andygrove My only concern is that if, in the future, you were unable to contribute as much to the Python bindings, the maintenance burden would fall to other Arrow contributors who may have less motivation to maintain them, which could slow down progress.

Yes, that is a good point. I suppose there is always the option to move it back out again, but I would prefer to see us work towards having more committers on the project.

andygrove, Jul 22 '22 09:07