Phillip Cloud
Thanks for making the issue. This is actually a good first issue, I think.
@Abhijnan-Bajpai Sorry for the delay! Feel free to work on the issue without being assigned! I've assigned you now though :)
What changes would need to be made to ibis to support this?
@timothydijamco Let's wait to merge this until after the 3.1 release in about 2 weeks if you're okay with that. We're starting the 4.0 effort after that.
@timothydijamco Rebasing is preferred.
@timothydijamco After some more thinking on this, I'm not sure we want to bake pandas semantics into PySpark. NaNs are not nulls and forcing them to be so isn't correct....
@timothydijamco I've updated this PR to globally handle treating NaNs as nulls, so that the behavior is consistent no matter what the expression is. This works by using Spark's `nanvl`...
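To illustrate the NaN-to-null idea in the comment above, here is a minimal plain-Python sketch. It is not the PR's code and does not call Spark: `nanvl_like` is a hypothetical helper that only mimics the documented behavior of Spark's `nanvl(col1, col2)` (return the first value unless it is NaN, otherwise the second), and using `None` as the fallback is an assumption about how NaNs would be mapped to nulls.

```python
import math


def nanvl_like(value, fallback):
    """Hypothetical stand-in for Spark's nanvl: keep value unless it is NaN."""
    if isinstance(value, float) and math.isnan(value):
        return fallback
    return value


# A column holding a real value, a NaN, a null (None), and another real value.
column = [1.0, float("nan"), None, 4.0]

# NaN and null are distinct: NaN is a float that is not equal to itself,
# while None stands for a missing value.
assert column[1] != column[1]
assert column[2] is None

# Applying the replacement to every element treats NaNs as nulls,
# independent of whatever expression is later built on the column.
cleaned = [nanvl_like(v, None) for v in column]
print(cleaned)
```

Applying the replacement once at the column level, rather than per operation, is what keeps the behavior the same across expressions.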
@timothydijamco We're going to release 3.1 tomorrow (July 22, 2022) so I don't think a fix for this will make it in! However, definitely for 4.0.0 or perhaps a bugfix...
@gerrymanoim That makes sense! Happy to review PRs for this. Is this something that's high priority for y'all?
@timothydijamco Sounds good! I'll review the PR as soon as it's up.