
GFQL: Path predicates & path mode

Open lmeyerov opened this issue 1 month ago • 0 comments

Why

Expose paths when users request them (pay-as-you-go) while keeping default set semantics fast.

Deliverables

  • Syntax: MATCH PATH p = a->b->c WHERE … RETURN p
    • WHERE applies same-path semantics over named steps, exactly as in non-PATH GFQL queries
  • Execution plan:
    • Run F/B/F (+ WHERE) to prune the subgraph (set semantics)
    • Enumerate paths only on the pruned graph, using sparse gathers per step
  • Factorized result container with lazy enumeration, row caps, and streaming
  • Optional path predicates (length, simple/non-simple, scoring) that trigger path mode and are not available in non-PATH queries
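
The prune-then-enumerate plan above can be sketched in plain Python. This is a minimal illustration and not the GFQL implementation: `prune_steps`, `enumerate_paths`, the edge-list input, and the per-step node predicates are all hypothetical names and shapes chosen for the example. It assumes a fixed-length pattern such as a->b->c, runs a forward and a backward pass with set semantics, and only then enumerates paths lazily over the pruned per-step node sets. A row cap is applied by slicing the generator.

```python
from itertools import islice

def prune_steps(edges, step_preds):
    """Forward/backward passes: per-step node sets that lie on at least one full match."""
    nodes = {n for e in edges for n in e}
    # Forward: candidates for step 0, then nodes reachable at each later step.
    fwd = [{n for n in nodes if step_preds[0](n)}]
    for pred in step_preds[1:]:
        fwd.append({d for s, d in edges if s in fwd[-1] and pred(d)})
    # Backward: keep only nodes that can still reach the final step.
    keep = [set() for _ in fwd]
    keep[-1] = fwd[-1]
    for i in range(len(fwd) - 2, -1, -1):
        keep[i] = {s for s, d in edges if s in fwd[i] and d in keep[i + 1]}
    return keep

def enumerate_paths(edges, keep):
    """Lazily yield node tuples, expanding only over the pruned per-step sets."""
    adj = {}
    for s, d in edges:
        adj.setdefault(s, []).append(d)

    def extend(path, i):
        if i == len(keep):
            yield tuple(path)
            return
        for d in adj.get(path[-1], ()):
            if d in keep[i]:
                yield from extend(path + [d], i + 1)

    for start in sorted(keep[0]):
        yield from extend([start], 1)

# Hypothetical toy graph: step membership is encoded in the node name prefix.
edges = [("a1", "b1"), ("a1", "b2"), ("a2", "b2"),
         ("b1", "c1"), ("b2", "c1"), ("b2", "x1"), ("a3", "b3")]
preds = [lambda n: n.startswith("a"),
         lambda n: n.startswith("b"),
         lambda n: n.startswith("c")]

keep = prune_steps(edges, preds)                      # set-semantics result
first_two = list(islice(enumerate_paths(edges, keep), 2))  # row cap of 2
```

Because every node left in `keep[i]` is both reachable from step 0 and able to reach the last step, the enumerator never walks into a dead end, so the work done is proportional to the number of paths actually emitted.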

Optimization Modes & Switches

  • Factorize outputs and enumerate on demand (Kùzu-style)
  • Use sideways information passing to push semijoin filters and avoid scanning unrelated adjacency/properties
  • Maintain CSR/CSC adjacency in device memory; align edge-property columns for sequential reads
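
As a rough illustration of the adjacency and filtering bullets above, the following host-side sketch builds a CSR layout, reorders an edge-property column to match it, and gathers only the out-edges of a frontier whose destinations pass a semijoin filter. The names `build_csr` and `gather`, the integer node ids, and the `weights` column are assumptions made for the example; a real implementation would presumably keep these arrays in device memory rather than Python lists.

```python
def build_csr(num_nodes, edges):
    """Return (indptr, indices, edge_ids); out-neighbors of v are indices[indptr[v]:indptr[v + 1]]."""
    indptr = [0] * (num_nodes + 1)
    for s, _ in edges:
        indptr[s + 1] += 1
    for v in range(num_nodes):
        indptr[v + 1] += indptr[v]  # prefix sum over out-degrees
    indices = [0] * len(edges)
    edge_ids = [0] * len(edges)
    cursor = indptr[:-1].copy()  # next free slot per source node
    for eid, (s, d) in enumerate(edges):
        pos = cursor[s]
        indices[pos] = d
        edge_ids[pos] = eid
        cursor[s] += 1
    return indptr, indices, edge_ids

def gather(indptr, indices, edge_ids, frontier, allowed_dst):
    """Read only the adjacency slices of frontier nodes; keep edges whose destination is allowed."""
    out = []
    for v in sorted(frontier):
        for pos in range(indptr[v], indptr[v + 1]):
            if indices[pos] in allowed_dst:
                out.append((v, indices[pos], edge_ids[pos]))
    return out

# Hypothetical toy graph with integer node ids 0..3 and one edge property.
edges = [(0, 1), (2, 1), (0, 2), (1, 3), (2, 3)]
weights = [0.5, 0.1, 0.9, 0.3, 0.7]

indptr, indices, edge_ids = build_csr(4, edges)
# Align the property column with CSR order so each node's slice reads sequentially.
weights_csr = [weights[e] for e in edge_ids]
# Semijoin filter: only destinations known to survive a later step (here, node 3).
hits = gather(indptr, indices, edge_ids, frontier={0, 2}, allowed_dst={3})
```

The `allowed_dst` set plays the role of information passed sideways from a later step: edges that cannot contribute to any match are dropped at gather time instead of being materialized and filtered afterwards.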

Acceptance

  • Matches enumerator oracle on small graphs; respects row caps and streaming
  • Benchmarks vs DataFrame path-table baseline, reporting intermediates, peak memory, wall time
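
One possible shape for the oracle check is sketched below. It is an assumption about how such a test could look, not the project's test suite: `oracle_paths` brute-forces every node tuple on a small graph, `check_against_oracle` compares any candidate enumerator against it and verifies the row cap, and `dfs_candidate` is a stand-in for the enumerator under test. The sketch assumes a simple directed graph without parallel edges and a pattern with at least two steps.

```python
from itertools import islice, product

def oracle_paths(edges, step_preds):
    """Brute force: try every node tuple and keep those that form a matching path."""
    edge_set = set(edges)
    nodes = sorted({n for e in edges for n in e})
    return {
        combo for combo in product(nodes, repeat=len(step_preds))
        if all(p(n) for p, n in zip(step_preds, combo))
        and all((a, b) in edge_set for a, b in zip(combo, combo[1:]))
    }

def check_against_oracle(candidate, edges, step_preds, row_cap):
    """Assert the candidate yields exactly the oracle's paths and honors the row cap."""
    expected = oracle_paths(edges, step_preds)
    full = list(candidate(edges, step_preds))
    capped = list(islice(candidate(edges, step_preds), row_cap))
    assert set(full) == expected and len(full) == len(expected)  # no misses, no duplicates
    assert len(capped) == min(row_cap, len(expected))
    assert set(capped) <= expected
    return len(expected)

def dfs_candidate(edges, step_preds):
    """Stand-in enumerator under test: adjacency-based depth-first expansion."""
    adj = {}
    for s, d in edges:
        adj.setdefault(s, []).append(d)

    def extend(path):
        if len(path) == len(step_preds):
            yield tuple(path)
            return
        for d in adj.get(path[-1], ()):
            if step_preds[len(path)](d):
                yield from extend(path + [d])

    for n in sorted(adj):
        if step_preds[0](n):
            yield from extend([n])

edges = [("a1", "b1"), ("a1", "b2"), ("a2", "b2"),
         ("b1", "c1"), ("b2", "c1"), ("b2", "x1"), ("a3", "b3")]
preds = [lambda n: n.startswith("a"),
         lambda n: n.startswith("b"),
         lambda n: n.startswith("c")]
num_paths = check_against_oracle(dfs_candidate, edges, preds, row_cap=2)
```

The oracle is exponential in pattern length, which is why it is only suitable for the small graphs mentioned in the acceptance criteria.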

References

  • GraphFrames motif/path-table baseline for opt-in path mode
  • Kùzu factorized enumeration + CIDR papers for lazy output strategies

lmeyerov, Nov 17 '25 08:11