DataFrame#transform method added in Spark 3

Open MrPowers opened this issue 5 years ago • 1 comments

The `from quinn.extensions import *` import statement will run this code.

Should we add a PySpark version check to only run this code if the PySpark version is less than 3?

@nvander1 @afranzi - let me know your thoughts!

MrPowers avatar Jun 06 '20 13:06 MrPowers

Agreed. Looking forward to the new Spark 3.0.0 release.

afranzi avatar Jun 07 '20 20:06 afranzi

I say we just delete this file. This would be something good to tackle for the 1.0 release.

MrPowers avatar Oct 23 '22 22:10 MrPowers

Is this issue open for contribution?

NikhilGupta178 avatar Mar 10 '23 17:03 NikhilGupta178

@NikhilGupta178 - yea, there is a suggested solution here: https://github.com/MrPowers/quinn/issues/35#issuecomment-1460146628. Do you want to grab this one?

cc: @SemyonSinchenko

MrPowers avatar Mar 10 '23 17:03 MrPowers

@MrPowers Yeah, I would like to do this. Suggested solution looks like a good fit.

NikhilGupta178 avatar Mar 10 '23 18:03 NikhilGupta178

@NikhilGupta178 - assigned you to the issue. Let us know when you have a PR that's ready for us to review/merge. Thank you!

MrPowers avatar Mar 10 '23 18:03 MrPowers

I think we should do it for all the extensions, not only transform.

P.S. @MrPowers we already discussed that such a pattern does not fit with the Zen of Python; we should avoid it and maybe delete it in future versions. What do you think about wrapping all the extensions in syntactic sugar with a DeprecationWarning inside? Like this:

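import warnings

from pyspark.sql import DataFrame


def _ext_function(self, f):
    # Warn callers that the monkey-patched extension may go away.
    warnings.warn(
        "Extensions may be removed in future versions of quinn. Please use explicit functions instead.",
        category=DeprecationWarning,
        stacklevel=2,
    )
    # `transform` is assumed to be the existing helper defined in this extensions module.
    return transform(self, f)


# Keep the built-in DataFrame.transform when it exists (Spark 3+); otherwise use the fallback.
DataFrame.transform = getattr(DataFrame, "transform", _ext_function)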
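
For illustration only, here is a minimal sketch of how the `getattr` fallback above behaves, using two stand-in classes instead of the real PySpark `DataFrame`. `LegacyFrame`, `ModernFrame`, `install_transform`, and `double` are hypothetical names that are not part of quinn or PySpark, and the fallback calls `f(self)` directly, which is an assumption about what the extension's `transform` does.

```python
import warnings


def _ext_transform(self, f):
    """Fallback transform: warn, then apply f to the frame (assumed behavior)."""
    warnings.warn(
        "Extensions may be removed in future versions. Please use explicit functions instead.",
        category=DeprecationWarning,
        stacklevel=2,
    )
    return f(self)


class LegacyFrame:
    """Hypothetical stand-in for a DataFrame class without a built-in transform."""

    def __init__(self, rows):
        self.rows = rows


class ModernFrame:
    """Hypothetical stand-in for a DataFrame class that already has transform."""

    def __init__(self, rows):
        self.rows = rows

    def transform(self, f):
        return f(self)


def install_transform(cls):
    # Keep the class's own transform when present; otherwise attach the fallback.
    cls.transform = getattr(cls, "transform", _ext_transform)
    return cls


install_transform(LegacyFrame)
install_transform(ModernFrame)


def double(frame):
    # Example transformation: double every row value.
    return [row * 2 for row in frame.rows]
```

With this setup, only the class that lacked `transform` goes through the warning path; the class that already defines the method is left untouched.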

SemyonSinchenko avatar Mar 10 '23 18:03 SemyonSinchenko

@SemyonSinchenko Isn't the above code snippet a complete solution for this particular issue?

NikhilGupta178 avatar Mar 10 '23 19:03 NikhilGupta178

> @SemyonSinchenko Isn't the above code snippet a complete solution for this particular issue?

It is just one way to do it. To be honest, I didn't run it, so there may be a typo or syntax error.

SemyonSinchenko avatar Mar 10 '23 19:03 SemyonSinchenko

@SemyonSinchenko - can you open a new issue for the broader discussion?

@NikhilGupta178 - for purposes of this issue, you can just submit a PR that addresses the DataFrame transform extension. Loading it for PySpark 2 and ignoring it for PySpark 3 would be ideal. Thank you!
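
One possible shape for that version gate is sketched below. It is only a sketch under assumptions, not the code that was merged: `spark_major_version`, `should_load_transform_extension`, and `register_transform` are hypothetical helper names, and `transform` is assumed to simply call `f(self)`.

```python
def spark_major_version(version: str) -> int:
    """Return the major component of a version string such as '3.0.0'."""
    return int(version.split(".")[0])


def should_load_transform_extension(version: str) -> bool:
    """True when the given PySpark version predates the built-in DataFrame.transform."""
    return spark_major_version(version) < 3


def transform(self, f):
    # Assumed extension behavior: apply f to the DataFrame and return the result.
    return f(self)


def register_transform(dataframe_cls, version: str) -> bool:
    """Attach transform to dataframe_cls only for PySpark 2; report whether it was attached."""
    if should_load_transform_extension(version):
        dataframe_cls.transform = transform
        return True
    return False
```

In a real environment this would presumably be called as `register_transform(pyspark.sql.DataFrame, pyspark.__version__)`, so the monkey patch is applied on PySpark 2 and skipped on PySpark 3.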

MrPowers avatar Mar 10 '23 19:03 MrPowers

@MrPowers I created a discussion here: #74. I guess we can continue there.

SemyonSinchenko avatar Mar 10 '23 20:03 SemyonSinchenko

@MrPowers - PR created

NikhilGupta178 avatar Mar 12 '23 04:03 NikhilGupta178

The task was done in #81

SemyonSinchenko avatar Mar 31 '23 08:03 SemyonSinchenko