datafusion icon indicating copy to clipboard operation
datafusion copied to clipboard

Add Substrait roundtrip support for `RecursiveQuery` and recursive CTE scans

Open kosiew opened this issue 1 month ago • 1 comments

Which issue does this PR close?

  • Closes #16274.

Rationale for this change

Substrait roundtrip mode currently fails for plans that include RecursiveQuery, resulting in not_impl_err!("Unsupported plan type: RecursiveQuery") during SQL logic tests. This prevents recursive CTE queries from being roundtripped through Substrait, causing multiple cte.slt cases to fail and reducing confidence in Substrait interoperability.

This PR adds end-to-end support for serializing and deserializing RecursiveQuery logical plans and their associated recursive work-table scans. This unblocks the failing SQL logic tests and improves parity between the DataFusion logical plan representation and its Substrait encoding.

What changes are included in this PR?

  • New logical_plan::recursive helper module

    • Introduces RECURSIVE_QUERY_TYPE_URL and RECURSIVE_SCAN_TYPE_URL to tag recursive structures in Substrait extensions.

    • Defines small prost messages for:

      • RecursiveQueryDetail { name, is_distinct } used to carry RecursiveQuery metadata.
      • RecursiveScanDetail { name } used to identify recursive work-table scans.
    • Provides helpers to encode/decode these messages:

      • encode_recursive_query_detail / decode_recursive_query_detail.
      • encode_recursive_scan_detail / decode_recursive_scan_detail.
    • Adds validation so that empty names and malformed payloads are reported as Substrait errors.

  • Producer-side support for RecursiveQuery

    • Adds logical_plan/producer/rel/recursive_query_rel.rs implementing:

      • from_recursive_query, which serializes LogicalPlan::RecursiveQuery into ExtensionMultiRel with two inputs:

        • inputs[0]: static_term.
        • inputs[1]: recursive_term.
      • Encodes { name, is_distinct } into the detail field using RECURSIVE_QUERY_TYPE_URL.

    • Wires this into the generic Substrait producer:

      • Extends SubstraitProducer with handle_recursive_query.
      • Updates to_substrait_rel to delegate LogicalPlan::RecursiveQuery to handle_recursive_query instead of returning not_impl_err!.
  • Consumer-side support for RecursiveQuery

    • Adds logical_plan/consumer/rel/recursive_query_rel.rs implementing:

      • from_recursive_query_rel, which reconstructs LogicalPlan::RecursiveQuery from ExtensionMultiRel.

      • Validates that:

        • The extension has exactly two inputs.
        • The detail field is present and decodes successfully.
      • Rebuilds RecursiveQuery { name, is_distinct, static_term, recursive_term }.

    • Integrates with DefaultSubstraitConsumer:

      • Detects RECURSIVE_QUERY_TYPE_URL in ExtensionMultiRel and routes to from_recursive_query_rel.
      • Falls back to the existing extension handling for other type URLs.
  • Support for recursive CTE work-table scans

    • Producer (TableScan → ReadRel)

      • Detects CteWorkTable-backed scans via DefaultTableSource.

      • Adds a helper recursive_scan_name(&TableScan) -> Option<String> that:

        • Downcasts to DefaultTableSource and then to CteWorkTable.
        • Confirms that the scan’s table name matches the work-table name.
      • When a CteWorkTable is detected, sets ReadRel.advanced_extension with:

        • type_url = RECURSIVE_SCAN_TYPE_URL.
        • value = encode_recursive_scan_detail(name).
    • Consumer (ReadRel → LogicalPlan::TableScan)

      • Parses ReadRel.advanced_extension and, when type_url == RECURSIVE_SCAN_TYPE_URL, decodes the recursive scan detail.
      • Verifies the recursive scan name matches the TableReference table name and returns a Substrait error if they differ.
      • Uses a CteWorkTable wrapped in a TableSource instead of resolving a regular catalog table for recursive scans.
      • Falls back to the existing resolve_table_ref logic for non-recursive scans.
  • Refactoring in ReadRel consumer

    • Replaces the earlier read_with_schema helper with a more flexible read_with_source helper that accepts a TableSource (either a catalog table or a CteWorkTable).
    • Ensures that schema compatibility checks are still performed after building the scan plan.
  • Tests

    • Adds unit tests for producer-side recursive scan encoding:

      • from_table_scan_sets_advanced_extension_for_cte_work_table ensures that:

        • A TableScan over a CteWorkTable uses RECURSIVE_SCAN_TYPE_URL in advanced_extension.
        • The encoded payload decodes back to the correct table name.
    • Adds roundtrip and error-coverage tests for recursive queries and scan details in roundtrip_logical_plan.rs:

      • roundtrip_recursive_query verifies that a RecursiveQuery roundtrips through Substrait and back, preserving the name and is_distinct flag.
      • serialize_recursive_query_with_empty_name_errors checks that encoding fails with a clear error when the RecursiveQuery name is empty.
      • decode_recursive_query_detail_malformed_bytes_errors ensures malformed bytes produce a descriptive Substrait error.
      • decode_recursive_scan_detail_malformed_bytes_errors similarly validates error handling for malformed recursive scan detail.
      • roundtrip_recursive_query_distinct confirms that is_distinct = true is preserved across the roundtrip.
      • roundtrip_recursive_query_preserves_child_plans does a structural sanity check that the main characteristics of the child plans (projections, filters, table scans) survive the roundtrip.
      • roundtrip_recursive_query_with_work_table_scan_executes runs a real recursive CTE (balances example) through Substrait roundtrip and executes it, asserting non-empty results.

Are these changes tested?

Before

> cargo test --test sqllogictests -- --substrait-round-trip cte.slt:175
    Finished `test` profile [unoptimized + debuginfo] target(s) in 0.65s
     Running bin/sqllogictests.rs (target/debug/deps/sqllogictests-79fc77e66b850bad)
Completed 1 test files in 0 seconds                                          External error: 1 errors in file ...

1. query failed: DataFusion error: This feature is not implemented: Unsupported plan type: RecursiveQuery....

After

❯ cargo test --test sqllogictests -- --substrait-round-trip cte.slt:175
    Finished `test` profile [unoptimized + debuginfo] target(s) in 0.78s
     Running bin/sqllogictests.rs (target/debug/deps/sqllogictests-79fc77e66b850bad)
Completed 1 test files in 0 seconds

Yes.

  • New and updated tests have been added to cover:

    • RecursiveQuery Substrait serialization and deserialization.
    • Encoding/decoding of RecursiveQueryDetail and RecursiveScanDetail, including malformed-byte and empty-name error paths.
    • Correct tagging of CteWorkTable scans via advanced_extension.
    • End-to-end execution of a recursive CTE query after Substrait roundtrip.
  • Existing Substrait roundtrip tests continue to run and provide regression coverage for non-recursive plans.

Are there any user-facing changes?

  • Behavioral improvement (no breaking API changes):

    • Substrait roundtrip mode now supports logical plans that contain RecursiveQuery and recursive CTE work-table scans. Previously, such plans failed with not_impl_err!("Unsupported plan type: RecursiveQuery").
    • Users who rely on Substrait serialization/deserialization for queries with recursive CTEs should now see these queries roundtrip and execute successfully.
  • No public Rust API signatures are changed, and no configuration flags or SQL syntax are modified.

LLM-generated code disclosure

This PR includes LLM-generated code and comments. All LLM-generated content has been manually reviewed and tested.

kosiew avatar Dec 07 '25 13:12 kosiew

hi @gabotechs Can you take a look?

kosiew avatar Dec 11 '25 03:12 kosiew

Sure! will take a look soon

gabotechs avatar Dec 15 '25 15:12 gabotechs