
Figure out how to deal with the PySpark 2 extensions

Open MrPowers opened this issue 5 years ago • 1 comments

The DataFrame#transform extension is useful for PySpark 2 users but should not be applied for PySpark 3 users, because the method is built into the API there.

When a user runs from quinn.extensions import *, we can either use the spark.version variable to programmatically skip modules that shouldn't be imported for Spark 3, or we can design a separate import interface.
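For illustration, the version-check option could look roughly like the sketch below. The helper names are hypothetical and not part of quinn; the version string is passed in explicitly here, whereas a real module would presumably read it from spark.version or pyspark.__version__.

```python
def spark_major_version(version: str) -> int:
    """Return the major component of a version string such as '3.3.0'."""
    return int(version.split(".")[0])


def needs_transform_extension(version: str) -> bool:
    """True when DataFrame.transform is not built in, i.e. before Spark 3."""
    return spark_major_version(version) < 3


# Hypothetical usage: decide whether the extension should be attached at all.
for v in ("2.4.8", "3.1.2", "3.3.0"):
    print(v, needs_transform_extension(v))
```

The import code would then attach the extension only when needs_transform_extension returns True.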

I'm still not sure which approach is better.

MrPowers avatar Jul 18 '20 16:07 MrPowers

I am going to switch the project to Python 3 and remove DataFrame#transform.

MrPowers avatar Feb 25 '21 15:02 MrPowers

We can replace DataFrame.transform = transform with something like this:

DataFrame.transform = getattr(DataFrame, "transform", transform)

and it should work in both the 2.x and 3.x versions. I can open a PR with this.

P.S. I can do this for all extensions to avoid such problems or any unexpected behavior in the future.

SemyonSinchenko avatar Mar 08 '23 13:03 SemyonSinchenko

@SemyonSinchenko - would there be any way for PySpark 2 to be able to import this function, but for the function to error out if a user is using PySpark 3 or greater and tried to import this function? I'd prefer for PySpark 3 users to leverage the built-in function. Sidenote: they updated this particular function in PySpark 3.3, so the 3.3 method signature is different than the 3.1 method signature 🙃
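A strict "error out on PySpark 3" behavior, as asked about here, might be sketched as a guard like the one below. The function name and the error message are assumptions made for illustration; nothing in quinn is confirmed to work this way, and the version string would come from pyspark in practice.

```python
def guard_transform_extension(version: str) -> None:
    """Raise ImportError when the running Spark already ships DataFrame.transform."""
    major = int(version.split(".")[0])
    if major >= 3:
        raise ImportError(
            "DataFrame.transform is built into PySpark 3+; "
            "use the built-in method instead of the extension."
        )


# Hypothetical usage: PySpark 2 passes silently, PySpark 3 raises.
guard_transform_extension("2.4.8")
try:
    guard_transform_extension("3.3.0")
except ImportError as exc:
    print(exc)
```

Note that this is stricter than simply leaving the built-in method in place: PySpark 3 users would see an explicit error rather than silently getting the native implementation.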

MrPowers avatar Mar 08 '23 14:03 MrPowers

would there be any way for PySpark 2 to be able to import this function, but for the function to error out if a user is using PySpark 3 or greater and tried to import this function?

That's exactly what my snippet of code will do. If DataFrame already has a transform attribute, the snippet leaves it as is; if there is no such attribute, it adds it. So the behavior will depend on the version.
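A minimal sketch of that getattr behavior, using two stand-in classes instead of real PySpark DataFrames so it runs without Spark installed. LegacyFrame and ModernFrame are hypothetical names, and the fallback transform is a simplified assumption, not quinn's actual implementation.

```python
def transform(self, f):
    """Fallback extension: apply f to the frame and return the result."""
    return f(self)


class LegacyFrame:
    """Stand-in for a PySpark 2 DataFrame, which has no transform method."""


class ModernFrame:
    """Stand-in for a PySpark 3 DataFrame, which ships its own transform."""

    def transform(self, f):
        return ("builtin", f(self))


# Same one-liner as above: keep an existing attribute, otherwise add the fallback.
for cls in (LegacyFrame, ModernFrame):
    cls.transform = getattr(cls, "transform", transform)

legacy_result = LegacyFrame().transform(lambda df: "ok")
modern_result = ModernFrame().transform(lambda df: "ok")
```

In this sketch the legacy class picks up the fallback, while the modern class keeps its own method untouched; nothing raises in either case.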

SemyonSinchenko avatar Mar 08 '23 14:03 SemyonSinchenko

@SemyonSinchenko - your suggested solution sounds ideal in that case. Can you please send a PR?

MrPowers avatar Mar 08 '23 16:03 MrPowers

@SemyonSinchenko - your suggested solution sounds ideal in that case. Can you please send a PR?

I'll do it.

SemyonSinchenko avatar Mar 08 '23 16:03 SemyonSinchenko

Work was done in #81

SemyonSinchenko avatar Mar 31 '23 08:03 SemyonSinchenko