[SPARK-49249][SPARK-49320] Add new tag-related APIs in Connect back to Spark Core

Open xupefei opened this issue 1 year ago • 5 comments

What changes were proposed in this pull request?

This PR adds several new tag-related APIs in Connect back to Spark Core. Following the isolation practice in the original Connect API, the newly introduced APIs also support isolation:

  • interrupt{Tag,All,Operation} can only cancel jobs created by this Spark session.
  • {add,remove}Tag and {get,clear}Tags only apply to jobs created by this Spark session.

Unlike the related APIs in SparkContext, all of the above APIs are blocking, which means that the caller thread is blocked while jobs are being cancelled.
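
To illustrate the isolation described above, here is a minimal toy model in Python. It is not Spark's implementation: `JobRegistry`, `Session`, and the `session-<uuid>` tag prefix are hypothetical names invented for this sketch, and the method names only loosely mirror the APIs listed in the PR description. It assumes isolation can be modelled by qualifying every user tag with a per-session identifier before it reaches the shared scheduler. All calls are synchronous, so each `interrupt_*` call returns only after the matching jobs have been removed, which mimics the blocking behaviour.

```python
import uuid


class JobRegistry:
    """Hypothetical stand-in for a scheduler shared by all sessions."""

    def __init__(self):
        self._jobs = {}  # job id -> set of fully qualified tags
        self._next_id = 0

    def submit(self, tags):
        job_id = self._next_id
        self._next_id += 1
        self._jobs[job_id] = set(tags)
        return job_id

    def cancel_with_tag(self, tag):
        # Cancel every job carrying exactly this tag; return the cancelled ids.
        cancelled = sorted(j for j, tags in self._jobs.items() if tag in tags)
        for j in cancelled:
            del self._jobs[j]
        return cancelled

    def running(self):
        return sorted(self._jobs)


class Session:
    """Hypothetical session whose tags are invisible to other sessions."""

    def __init__(self, registry):
        self._registry = registry
        self._session_tag = f"session-{uuid.uuid4()}"  # assumed naming scheme
        self._user_tags = set()

    def _qualify(self, tag):
        # Prefix the user tag so identical tag names in other sessions differ.
        return f"{self._session_tag}-{tag}"

    def add_tag(self, tag):
        self._user_tags.add(tag)

    def remove_tag(self, tag):
        self._user_tags.discard(tag)

    def get_tags(self):
        return set(self._user_tags)

    def clear_tags(self):
        self._user_tags.clear()

    def run_job(self):
        # Every job carries the session tag plus the qualified user tags.
        tags = {self._session_tag} | {self._qualify(t) for t in self._user_tags}
        return self._registry.submit(tags)

    def interrupt_tag(self, tag):
        return self._registry.cancel_with_tag(self._qualify(tag))

    def interrupt_all(self):
        return self._registry.cancel_with_tag(self._session_tag)


registry = JobRegistry()
a, b = Session(registry), Session(registry)
a.add_tag("etl")
b.add_tag("etl")  # same tag name, different session
job_a = a.run_job()
job_b = b.run_job()

cancelled = a.interrupt_tag("etl")  # only touches jobs started by session a
remaining = registry.running()      # session b's job is still running
```

In this sketch both sessions use the tag name "etl", yet interrupting it from session `a` leaves session `b`'s job untouched, because the registry only ever sees session-qualified tags.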

Why are the changes needed?

To close the API gap between Connect and Core.

Does this PR introduce any user-facing change?

Yes, Core users can use some new APIs.

How was this patch tested?

New test added.

Was this patch authored or co-authored using generative AI tooling?

No.

xupefei avatar Aug 20 '24 12:08 xupefei

Could we file a JIRA for Python API set too? Just to make sure we don't miss it out

HyukjinKwon avatar Aug 21 '24 04:08 HyukjinKwon

Could we file a JIRA for Python API set too? Just to make sure we don't miss it out

Done! https://issues.apache.org/jira/browse/SPARK-49337

xupefei avatar Aug 21 '24 14:08 xupefei

@HyukjinKwon @hvanhovell This PR is now ready for review. Could you take a look? Thanks!

xupefei avatar Aug 23 '24 14:08 xupefei

I feel like what you're doing here is similar to JobArtifactSet. It has things to do with SparkContext but we separated them to JobArtifactSet with a state so we can decouple Spark core from Spark SQL.

HyukjinKwon avatar Aug 26 '24 09:08 HyukjinKwon

I feel like what you're doing here is similar to JobArtifactSet. It has things to do with SparkContext but we separated them to JobArtifactSet with a state so we can decouple Spark core from Spark SQL.

Yes exactly. Basically the equivalent of JobArtifactSet.withActiveJobArtifactState is SparkSession.withActive.

xupefei avatar Aug 26 '24 10:08 xupefei

LGTM. I left a few minor comments. Let me know if you want to address them now, or in a follow-up? Two follow-ups here: We need to add this to PySpark, and we need to homogenize this with the Connect implementation.

I'll address most comments in this PR. Currently, I am being distracted by something else, but will come back very soon.

xupefei avatar Sep 06 '24 18:09 xupefei

Merging to master.

hvanhovell avatar Sep 18 '24 03:09 hvanhovell

FYI, there are two open JIRA issues in the interrupt and cancellation area.

dongjoon-hyun avatar Sep 18 '24 14:09 dongjoon-hyun

Ping once more, @xupefei and @hvanhovell. Could you fix the flaky test or disable it (if you are busy), please?

  • https://github.com/apache/spark/actions/runs/11353570330/job/31654735430
SparkSessionJobTaggingAndCancellationSuite:
...
- Cancellation APIs in SparkSession are isolated *** FAILED ***

dongjoon-hyun avatar Oct 17 '24 15:10 dongjoon-hyun

@xupefei mind taking a look please?

HyukjinKwon avatar Oct 23 '24 10:10 HyukjinKwon

On it.

xupefei avatar Oct 23 '24 14:10 xupefei

Trying out a fix at https://github.com/apache/spark/pull/48622.

xupefei avatar Oct 23 '24 15:10 xupefei