datafusion icon indicating copy to clipboard operation
datafusion copied to clipboard

feat: update FFI TableProvider and ExecutionPlan to use FFI Session and TaskContext

Open timsaucer opened this issue 1 month ago • 1 comments

Which issue does this PR close?

Addresses part of https://github.com/apache/datafusion/issues/18671 but does not close it.

Rationale for this change

This is the major change to address the requirements of #18671. This PR combines all of the previous PRs in the issue and uses them in FFI_TableProvider and FFI_ExecutionPlan. With this change the only remaining thing to close the issue is to remove the core crate. That is a large PR that mostly just changes import paths and will be a follow up.

What changes are included in this PR?

  • Update all structs in the FFI crate to use the FFI_PhysicalExpr, FFI_Session, FFI_TaskContext, and FFI_LogicalExtensionCodec.
  • Remove creation of SessionContext within the FFI crate
  • Updates unit tests

Are these changes tested?

Unit tests are added. Coverage report: Screenshot 2025-12-11 at 10 42 21 AM

Are there any user-facing changes?

Yes.

There is one major change to using the FFI crate that downstream users will need to implement. Now when creating a table provider, catalog provider, etc you need to provide a TaskContextProvider and an optional LogicalExtensionCodec.

  • [ ] TODO: Update migration guide to include discussion of all changes to this crate for the linked issue.

timsaucer avatar Dec 11 '25 09:12 timsaucer

@paleolimbot @comphead @milenkovicm @renato2099 Almost done with the FFI epic! There is only one PR after these and it's just an import cleanup job. I'm leaving this in draft until I write up the upgrade guide, but the code here is I believe in decent shape. This pulls together all of the previous work.

timsaucer avatar Dec 11 '25 10:12 timsaucer

I added a unit test to demonstrate the serialization/deserialization. It passes in this branch but fails against main. Results against main:

running 1 test
test tests::udf_as_input_to_udf ... FAILED

failures:

---- tests::udf_as_input_to_udf stdout ----
Error: Shared(Ffi("This feature is not implemented: PhysicalExtensionCodec is not provided for scalar function ffi_abs"))


failures:
    tests::udf_as_input_to_udf

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 2 filtered out; finished in 0.04s

@paleolimbot Does this demonstrate enough or do you think we need additional tests?

timsaucer avatar Dec 19 '25 18:12 timsaucer

Thank you @renato2099 @paleolimbot and @comphead !

timsaucer avatar Dec 20 '25 12:12 timsaucer