[SPARK-49086][CONNECT] Move ML function registration to SparkSessionExtensions

Open hvanhovell opened this issue 1 year ago • 3 comments

What changes were proposed in this pull request?

This PR moves ML function registration from the SparkConnectPlanner to the internal function registry. This registration is done using the SparkSessionExtensions mechanism.
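For readers unfamiliar with the mechanism, here is a minimal sketch of how a function can be added to the session's function registry through `SparkSessionExtensions.injectFunction`. It is not the code from this PR: the class name `ExampleFunctionExtensions` and the function name `my_upper` are hypothetical, and the built-in `Upper` expression stands in for an ML function. It assumes a classic (non-Connect) `SparkSession` with `spark-sql` on the classpath.

```scala
import org.apache.spark.sql.{SparkSession, SparkSessionExtensions}
import org.apache.spark.sql.catalyst.FunctionIdentifier
import org.apache.spark.sql.catalyst.expressions.{Expression, ExpressionInfo, Upper}

// Hypothetical extension, for illustration only: registers a SQL function
// named `my_upper` that is backed by the built-in Upper expression.
class ExampleFunctionExtensions extends (SparkSessionExtensions => Unit) {
  override def apply(extensions: SparkSessionExtensions): Unit = {
    // injectFunction takes a (FunctionIdentifier, ExpressionInfo, builder) triple.
    extensions.injectFunction((
      FunctionIdentifier("my_upper"),
      new ExpressionInfo(classOf[Upper].getName, "my_upper"),
      // The builder turns the call's argument expressions into an Expression.
      (children: Seq[Expression]) => Upper(children.head)
    ))
  }
}

object ExampleApp {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      // Apply the extension while the session is being built.
      .withExtensions(new ExampleFunctionExtensions)
      .getOrCreate()

    // The injected function resolves by name like any registered function.
    spark.sql("SELECT my_upper('spark')").show()
    spark.stop()
  }
}
```

Because the function is resolved by name through the registry, a caller only needs the function's name and does not have to construct the Catalyst expression itself.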

Why are the changes needed?

This is part of unifying the Connect and Classic Column APIs. The PR decouples the ML functions from Catalyst expressions.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Existing tests.

Was this patch authored or co-authored using generative AI tooling?

No.

hvanhovell avatar Aug 06 '24 17:08 hvanhovell

@zhengruifeng this is failing org.apache.spark.ml.FunctionsLoadingSuite. Can I remove that suite? I can't make it error out anymore after this change.

hvanhovell avatar Aug 19 '24 20:08 hvanhovell

@hvanhovell I think we can remove the test org.apache.spark.ml.FunctionsLoadingSuite introduced in https://github.com/apache/spark/pull/43739, because we don't need the objects vectorToArrayUdf and vectorToArrayFloatUdf anymore.

Also cc @WeichenXu123 and @zsxwing, who were the reviewers of https://github.com/apache/spark/pull/43739

zhengruifeng avatar Aug 22 '24 05:08 zhengruifeng

@hvanhovell I wanted to merge this, but there's a conflict.

HyukjinKwon avatar Aug 26 '24 01:08 HyukjinKwon

Merging to master!

hvanhovell avatar Aug 28 '24 12:08 hvanhovell