[SPARK-49249][SPARK-49122][Connect][SQL] Add `addArtifact` API to the Spark SQL Core

Open xupefei opened this issue 1 year ago • 2 comments

What changes were proposed in this pull request?

This PR adds a set of `addArtifact` APIs to `SparkSession` in Spark SQL Core. These APIs were first introduced in Spark Connect.

The follow-up task is for PySpark.

Why are the changes needed?

To close the API compatibility gap between Spark Connect and Spark Classic.

Does this PR introduce any user-facing change?

Yes. Users will be able to call the new `addArtifact` APIs on `SparkSession` in Spark Classic.

How was this patch tested?

Added new tests.

Was this patch authored or co-authored using generative AI tooling?

No.

xupefei avatar Aug 06 '24 13:08 xupefei

@xupefei can you check if this works with UDFs?

hvanhovell avatar Aug 07 '24 20:08 hvanhovell

> @xupefei can you check if this works with UDFs?

I checked and found it doesn't work. The Spark Core session is currently not using the class loader provided by ArtifactManager: https://github.com/apache/spark/blob/11b682cf5b7c5360a02410be288b7905eecc1d28/common/utils/src/main/scala/org/apache/spark/util/SparkClassUtils.scala#L28-L47

For now we have to wrap the code with `artifactManager.withResources` to replace the class loader. This issue needs to be fixed.
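
To illustrate the workaround, here is a minimal, Spark-free sketch of the general pattern: temporarily swap the thread's context class loader around a block of code and restore it afterwards. It assumes that class lookup falls back to the thread context class loader, which is what the linked `SparkClassUtils` lines appear to do. `ContextLoaderSketch` and `withClassLoader` are hypothetical names for illustration only; this is not the actual `ArtifactManager` implementation.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.function.Supplier;

// Hypothetical stand-in for the "wrap the code" workaround; not Spark code.
class ContextLoaderSketch {

    // Runs `body` with `loader` installed as the thread context class loader,
    // then restores the previous loader even if `body` throws.
    static <T> T withClassLoader(ClassLoader loader, Supplier<T> body) {
        Thread current = Thread.currentThread();
        ClassLoader previous = current.getContextClassLoader();
        current.setContextClassLoader(loader);
        try {
            return body.get();
        } finally {
            current.setContextClassLoader(previous);
        }
    }

    public static void main(String[] args) {
        ClassLoader original = Thread.currentThread().getContextClassLoader();

        // Assumption: a session-scoped loader that would see the added artifacts.
        // It is left empty here because only the swap itself is being shown.
        ClassLoader sessionLoader = new URLClassLoader(new URL[0], original);

        boolean sawSessionLoader = withClassLoader(
            sessionLoader,
            () -> Thread.currentThread().getContextClassLoader() == sessionLoader);

        System.out.println("inside block uses session loader: " + sawSessionLoader);
        System.out.println("loader restored afterwards: "
            + (Thread.currentThread().getContextClassLoader() == original));
    }
}
```

Code that resolves classes through the context class loader only sees the session's artifacts while it runs inside the wrapper, which is why UDFs that run outside such a block would not pick up the added classes.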

xupefei avatar Aug 19 '24 10:08 xupefei