[FEA] Add multi-label node matching predicates for GFQL
name: Feature request
about: Suggest an idea for this project
title: "[FEA] Add multi-label node matching predicates for GFQL"
labels: enhancement
assignees: ''
Is your feature request related to a problem? Please describe.
Currently, GFQL provides the is_in() predicate for matching nodes that have any of several values (OR logic) for a given attribute. However, there's no built-in way to match nodes that must have multiple labels/values simultaneously (AND logic), which is a common pattern in graph databases like Neo4j where nodes can have multiple labels (e.g., :Person:Employee:Manager).
Dataframe-based graphs might store labels as:
- Array/list columns containing multiple labels per node
- Multiple boolean columns (one per label)
- Delimited strings containing multiple labels
In each case, there's no elegant way to express "find nodes that have ALL of these labels" without resorting to query strings.
Describe the solution you'd like
Add new predicates to support multi-label matching patterns:
- contains_all(values) - For array/list columns, match if the column contains all specified values:
  n({"labels": contains_all(["Person", "Employee"])})  # Node must have both labels
- contains_any(values) - Alias for is_in(), but clearer for array columns:
  n({"labels": contains_any(["Person", "Organization"])})  # Node has at least one label
- has_labels(labels) - Specialized predicate for label matching (could handle various storage formats):
  n({"labels": has_labels(["Person", "Employee"])})  # Must have all labels
These predicates would complement the existing is_in() predicate and make GFQL more expressive for multi-label graph patterns.
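To make the "various storage formats" idea for has_labels() concrete, here is a minimal pandas-only sketch. It is not GFQL code: `has_labels_mask`, the `sep` delimiter, and the `is_<label>` boolean-column naming are all assumptions made for illustration, and a real predicate would need to fit GFQL's predicate interface.

```python
import pandas as pd


def has_labels_mask(nodes, labels, column="labels", sep=";", bool_prefix="is_"):
    """Hypothetical helper: boolean mask of rows that carry ALL given labels."""
    required = set(labels)
    if column in nodes.columns:
        # List-like cells pass through; delimited strings are split first.
        as_lists = nodes[column].apply(
            lambda x: x.split(sep) if isinstance(x, str) else list(x)
        )
        return as_lists.apply(lambda xs: required.issubset(xs))
    # Fallback: one boolean column per label, e.g. "is_person" (assumed naming).
    bool_cols = [f"{bool_prefix}{label.lower()}" for label in labels]
    return nodes[bool_cols].all(axis=1)


wanted = ["Person", "Employee"]

# Layout 1: list column
nodes_list = pd.DataFrame(
    {"labels": [["Person", "Employee"], ["Person"], ["Organization"]]}
)
# Layout 2: delimited string column
nodes_str = pd.DataFrame({"labels": ["Person;Employee", "Person", "Organization"]})
# Layout 3: one boolean column per label
nodes_bool = pd.DataFrame(
    {"is_person": [True, True, False], "is_employee": [True, False, False]}
)

mask_list = has_labels_mask(nodes_list, wanted)
mask_str = has_labels_mask(nodes_str, wanted)
mask_bool = has_labels_mask(nodes_bool, wanted)
```

In this sketch only the first row matches under every layout, which is the behavior a single has_labels() predicate would be expected to hide behind one call.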
Describe alternatives you've considered
- Query strings - Currently possible but less readable and not type-safe:
  n(query="'Person' in labels and 'Employee' in labels")
- Multiple boolean columns - Works but requires a different schema:
  n({"is_person": True, "is_employee": True})
- Custom predicates - Users could implement their own, but built-in support would be better:
  def contains_all(values):
      return lambda col: col.apply(lambda x: all(v in x for v in values))
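The custom-predicate alternative can be tried on a plain pandas Series, independent of GFQL. The sketch below pairs contains_all with an analogous contains_any; both are user-level closures written for illustration, and how such callables would plug into n() is not specified in this issue.

```python
import pandas as pd


def contains_all(values):
    # AND logic: every requested value must appear in the cell's list.
    required = set(values)
    return lambda col: col.apply(lambda x: required.issubset(x))


def contains_any(values):
    # OR logic: at least one requested value appears in the cell's list.
    wanted = set(values)
    return lambda col: col.apply(lambda x: not wanted.isdisjoint(x))


labels = pd.Series([["Person", "Employee"], ["Person"], ["Organization"], []])

all_mask = contains_all(["Person", "Employee"])(labels)
any_mask = contains_any(["Person", "Organization"])(labels)
```

Both versions run a Python-level loop through `Series.apply`, which is one reason a built-in, vectorized predicate would be preferable.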
Additional context
- This feature would make GFQL more compatible with multi-label graph patterns common in Neo4j and other graph databases
- The implementation could be optimized for vectorized operations on pandas/cuDF
- Should be documented in the language spec alongside other predicates
- Would enhance the synthesis examples for LLM-based code generation with multi-label patterns
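As one possible direction for the vectorized implementation mentioned above, contains_all can be expressed with `explode` and a grouped distinct count instead of a per-row Python callback. This is a sketch under assumptions: `contains_all_vectorized` is a hypothetical name, the input Series is assumed to have a unique index, and the same approach on cuDF is untested here.

```python
import pandas as pd


def contains_all_vectorized(col, values):
    """Hypothetical helper: True where a list cell contains every value."""
    required = list(set(values))
    # One row per (node, label); the original index is repeated per label.
    exploded = col.explode()
    # Keep only labels that are among the required ones.
    hits = exploded[exploded.isin(required)]
    # Count distinct required labels found for each original row.
    counts = hits.groupby(level=0).nunique()
    # Rows with no hits are missing from counts, so fill them with 0.
    return counts.reindex(col.index, fill_value=0) == len(required)


labels = pd.Series(
    [["Person", "Employee"], ["Person"], ["Organization"], [], ["Person", "Person"]]
)
vec_mask = contains_all_vectorized(labels, ["Person", "Employee"])
```

Counting distinct labels rather than raw hits keeps a row such as ["Person", "Person"] from being mistaken for a match.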