modin
modin copied to clipboard
FEAT-#0000: Implement DataFrame API standard
What do these changes do?
This is an example of how DataFrame API protocol implementation might look like if it will be added directly to Modin, instead of https://github.com/data-apis/dataframe-api-compat.
- [x] first commit message and PR title follow format outlined here
NOTE: If you edit the PR title to match this format, you need to add another commit (even if it's empty) or amend your last commit for the CI job that checks the PR title to pick up the new PR title.
- [ ] passes
flake8 modin/ asv_bench/benchmarks scripts/doc_checker.py - [ ] passes
black --check modin/ asv_bench/benchmarks scripts/doc_checker.py - [ ] signed commit with
git commit -s - [ ] Resolves #?
- [ ] tests added and passing
- [ ] module layout described at
docs/development/architecture.rstis up-to-date
Why we're merging this to our repo instead of https://github.com/data-apis/dataframe-api-compat? Is this already determined that the dataframe-api repo will be abandoned and won't accept contributions?
In particular, I'm asking about this PR (data-apis/dataframe-api-compat#71), I've seen some fresh comments there from you. Are you expecting this to be merged? If so, do we need the PR in modin?
@dchigarev Why we're merging this to our repo instead of https://github.com/data-apis/dataframe-api-compat?
I consider this change as temporary in order to be in time for the 0.27 release.
Is this already determined that the dataframe-api repo will be abandoned and won't accept contributions?
I don't think so.
In particular, I'm asking about this PR (https://github.com/data-apis/dataframe-api-compat/pull/71), I've seen some fresh comments there from you. Are you expecting this to be merged? If so, do we need the PR in modin?
I’m still sure that it can be merged, but it definitely won’t make it in time for tomorrow’s release.
I consider this change as temporary in order to be in time for the 0.27 release.
Why do we want this to be in 0.27 release? cc @YarShev It looks like nothing pushes us to have a working df-api implementation? If someone is interested to try df-api on Modin, we could provide them @anmyachev's branch with the implementation
I consider this change as temporary in order to be in time for the 0.27 release.
why do we want this to be in 0.27 release? cc @YarShev
There is an opportunity to provide these changes earlier.
It is also worth noting that:
- The protocol is still in beta and may change significantly. However, almost all the code can be reused from Pandas, given that the public API is the same.
- Even if we place the code in
dataframe-api-compatrepository, this does not mean that it will be fully supported by other developers. Support is still expected from the Modin developers.
Even if we place the code in dataframe-api-compat repository, this does not mean that it will be fully supported by other developers.
Sure, I also thought the same. I'm just worried about how we're going to sync modin's repos with the PR to dataframes-api (https://github.com/data-apis/dataframe-api-compat/pull/71). The diffs are massive, so applying review suggestions for the dataframes-api PR and possibly fixing bugs for the implementation in the Modin at the same time, could result into a severe unsynchrony between these two implementations.
Don't we want to instead spend our efforts on the dataframes-api PR https://github.com/data-apis/dataframe-api-compat/pull/71 (making a review by our team/working on suggestions together) and the future support of modin's implementation there?
Closed in favour of https://github.com/modin-project/modin/pull/7196