Recommend adding support for PGQL

Open MuYiYong opened this issue 2 months ago • 4 comments

ref: https://pgql-lang.org/spec/2.1/

The reasons are as follows:

  1. It is a standardized language, which facilitates learning and promotion;
  2. PGQL naturally integrates well with SQL;
  3. It contributes to better integration with relational databases such as Oracle/Spanner.

MuYiYong, Oct 21 '25 08:10

I appreciate your insights regarding PGQL. You're right that PGQL has many strengths worth considering:

  • As a standardized language, it lowers the learning curve and facilitates broader adoption;
  • Its natural alignment with SQL provides excellent interoperability;
  • The design enables deep integration with relational databases like Oracle and Spanner.

We agree that PGQL shares notable similarities with ISO-GQL in several areas. The GeaFlow community is actively working toward ISO-GQL compliance for our graph query language. At the same time, we've introduced carefully considered SQL extensions to allow seamless combination of relational and graph operations — enabling users to leverage both paradigms where appropriate. This pragmatic integration may serve as a constructive starting point for harmonizing SQL and GQL syntaxes while delivering enhanced usability and expressiveness.

The community values such feedback as we evolve our language support, and we welcome further discussion on balancing standards compliance with practical utility.

Leomrlin, Oct 21 '25 12:10

Thank you for your response.

I tend to favor using a pure GQL or pure PGQL implementation, as they are defined by standards. Proprietary or heavily customized languages often introduce significant learning costs for users. Defining a language is also an extremely challenging task, a lesson we've learned from our own years of experience in similar efforts.

I understand that GeaFlow has substantial existing SQL operations and data sources. For this reason, I believe pure PGQL could satisfy the current requirements and would be a more suitable fit, as it aligns well with SQL.

Furthermore, following a standardized path can enhance the project's influence. I would even suggest that GeaFlow could actively participate in the PGQL standard process itself, helping to shape and influence its future direction.

MuYiYong, Oct 22 '25 03:10

@MuYiYong Hello, thank you for your suggestion. I have put together a detailed design plan below. Please take a look when you have time; it may help with the implementation.

Overview

Current State Analysis

Existing GQL Implementation

The current GeaFlow DSL implements a GQL-like syntax with the following key components:

  1. Parser Layer:

    • Uses Apache Calcite parser framework
    • Grammar defined in FreeMarker template files
    • Custom SqlKind extensions for GQL constructs
  2. SQL Node Layer:

    • SqlMatchPattern for MATCH clauses
    • SqlReturnStatement for RETURN clauses
    • SqlFilterStatement for FILTER clauses
    • SqlLetStatement for LET clauses
  3. Operator Layer:

    • Custom SqlOperator implementations for each GQL construct
    • SqlMatchPatternOperator, SqlReturnOperator, etc.
  4. Validation Layer:

    • GQLValidatorImpl for semantic validation
    • Custom namespace and scope handling
  5. Relational Algebra Layer:

    • GQLToRelConverter for converting to relational algebra
    • Graph-specific relational operators

PGQL 2.1 Features to Implement

Core Feature: OPTIONAL MATCH

The OPTIONAL MATCH clause matches a pattern optionally: when the pattern has no match, the variables it binds are set to null and the result row is kept rather than dropped.

Example:

SELECT p.name, e.relationship
MATCH (p:Person) -[e:KNOWS]-> (f:Person)
OPTIONAL MATCH (f) -[r:LIKES]-> (t:Technology)
WHERE p.name = 'John'
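
To make the null-padding behaviour concrete, here is a minimal plain-Java sketch of the semantics described above, written as a left outer join over two in-memory edge lists. It is only an illustration, not GeaFlow code: the class OptionalMatchSemanticsSketch, the Row type, and the sample data are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustration only: OPTIONAL MATCH behaves like a left outer join on the shared variable. */
final class OptionalMatchSemanticsSketch {

    /** One result row for (p)-[:KNOWS]->(f), optionally extended by (f)-[:LIKES]->(t). */
    static final class Row {
        final String person;
        final String friend;
        final String technology; // null when the optional pattern found no match

        Row(String person, String friend, String technology) {
            this.person = person;
            this.friend = friend;
            this.technology = technology;
        }

        @Override
        public String toString() {
            return person + "|" + friend + "|" + technology;
        }
    }

    /** Each edge is a {source, target} pair. */
    static List<Row> optionalMatch(List<String[]> knows, List<String[]> likes) {
        List<Row> rows = new ArrayList<>();
        for (String[] k : knows) {
            boolean matched = false;
            for (String[] l : likes) {
                if (l[0].equals(k[1])) { // join on the shared variable f
                    rows.add(new Row(k[0], k[1], l[1]));
                    matched = true;
                }
            }
            if (!matched) {
                // A regular MATCH would drop this row; OPTIONAL MATCH keeps it with null.
                rows.add(new Row(k[0], k[1], null));
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String[]> knows = Arrays.asList(
            new String[] {"John", "Alice"},
            new String[] {"John", "Bob"});
        List<String[]> likes = Arrays.asList(
            new String[] {"Alice", "Java"},
            new String[] {"Alice", "Rust"});
        for (Row row : optionalMatch(knows, likes)) {
            System.out.println(row);
        }
        // Prints:
        // John|Alice|Java
        // John|Alice|Rust
        // John|Bob|null
    }
}
```

Bob likes nothing in the sample data, so his row survives with a null technology instead of disappearing.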

Implementation Plan

1. Parser Layer Changes

1.1 Grammar Definition

Update the FreeMarker template gqlQuery.ftl to support OPTIONAL MATCH:

SqlCall GQLQuery() :
{
    SqlCall statement = null;
}
{
    (
        statement = GQLMatchStatement()
        |
        statement = GQLOptionalMatchStatement()  // Add this option
        |
        // ... existing options
    )
    {
        return statement;
    }
}

SqlCall GQLOptionalMatchStatement() :
{
      SqlCall statement = null;
}
{
      <OPTIONAL> <MATCH> statement = SqlOptionalMatchPattern(statement)
      (
        (
          statement = SqlLetStatement(statement) (<COMMA> statement = SqlLetStatement(statement))*
          [
            [ <NEXT> ] <OPTIONAL> <MATCH>
            statement = SqlOptionalMatchPattern(statement)
          ]
        )
        |
        (
           [ <NEXT> ] <OPTIONAL> <MATCH>
           statement = SqlOptionalMatchPattern(statement)
        )
      )*
      (
          statement = SqlReturn(statement)
          [
              <THEN>
              statement = SqlFilter(statement)
          ]
      )*
      {
          return statement;
      }
}

1.2 SqlOptionalMatchPattern Method

Add the parsing method for OPTIONAL MATCH patterns:

SqlCall SqlOptionalMatchPattern(SqlNode from) :
{
    SqlNodeList pathPatterns = null;
    SqlNode where = null;
    SqlNodeList orderBy = null;
    SqlNode limit = null;
    Span s = Span.of();
}
{
    pathPatterns = PathPatternList()
    [
        <WHERE> where = Expression(ExprContext.ACCEPT_SUB_QUERY)
    ]
    [ orderBy = OrderBy(true) ]
    [ <LIMIT> limit = UnsignedNumericLiteralOrParam() ]
    {
        return new SqlOptionalMatchPattern(s.end(this), from, pathPatterns, where, orderBy, limit);
    }
}
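
Both productions above pass the statement parsed so far into the next clause (statement = SqlOptionalMatchPattern(statement)), so each new node keeps its predecessor as its from operand. A minimal plain-Java sketch of that chaining, with no Calcite dependency; ClauseChainSketch and Clause are hypothetical names used only for illustration:

```java
import java.util.Arrays;
import java.util.List;

/** Illustration only: each parsed clause wraps the previously parsed statement. */
final class ClauseChainSketch {

    static final class Clause {
        final String keyword;
        final Clause from; // the statement parsed before this clause, or null for the first one

        Clause(String keyword, Clause from) {
            this.keyword = keyword;
            this.from = from;
        }

        @Override
        public String toString() {
            return from == null ? keyword : from + " -> " + keyword;
        }
    }

    /** Mirrors the grammar loop: statement = NextClause(statement). */
    static Clause chain(List<String> keywords) {
        Clause statement = null;
        for (String keyword : keywords) {
            statement = new Clause(keyword, statement);
        }
        return statement;
    }

    public static void main(String[] args) {
        Clause query = chain(Arrays.asList("OPTIONAL MATCH", "LET", "OPTIONAL MATCH", "RETURN"));
        System.out.println(query);
        // Prints: OPTIONAL MATCH -> LET -> OPTIONAL MATCH -> RETURN
    }
}
```

The last clause parsed becomes the root of the tree, which is why later layers walk the query through getFrom().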

2. SQL Node Layer Implementation

2.1 SqlOptionalMatchPatternOperator

Create SqlOptionalMatchPatternOperator.java (under geaflow/geaflow-dsl/geaflow-dsl-parser/src/main/java/org/apache/geaflow/dsl/operator/):

/*
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 *
 *   http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing,
 * software distributed under the License is distributed on an
 * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 * KIND, either express or implied.  See the License for the
 * specific language governing permissions and limitations
 * under the License.
 */

package org.apache.geaflow.dsl.operator;

import org.apache.calcite.sql.*;
import org.apache.calcite.sql.parser.SqlParserPos;
import org.apache.calcite.sql.type.ReturnTypes;
import org.apache.geaflow.dsl.sqlnode.SqlOptionalMatchPattern;

public class SqlOptionalMatchPatternOperator extends SqlOperator {

    public static final SqlOptionalMatchPatternOperator INSTANCE = new SqlOptionalMatchPatternOperator();

    private SqlOptionalMatchPatternOperator() {
        super("OptionalMatchPattern", SqlKind.OTHER, 2, true,
            ReturnTypes.SCOPE, null, null);
    }

    @Override
    public SqlCall createCall(
        SqlLiteral functionQualifier,
        SqlParserPos pos,
        SqlNode... operands) {
        return new SqlOptionalMatchPattern(pos, operands[0], (SqlNodeList) operands[1], operands[2],
            (SqlNodeList) operands[3], operands[4]);
    }

    @Override
    public SqlSyntax getSyntax() {
        return SqlSyntax.SPECIAL;
    }

    @Override
    public void unparse(
        SqlWriter writer,
        SqlCall call,
        int leftPrec,
        int rightPrec) {
        call.unparse(writer, leftPrec, rightPrec);
    }
}

2.2 SqlOptionalMatchPattern

Create SqlOptionalMatchPattern.java:

/*
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 *
 *   http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing,
 * software distributed under the License is distributed on an
 * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 * KIND, either express or implied.  See the License for the
 * specific language governing permissions and limitations
 * under the License.
 */

package org.apache.geaflow.dsl.sqlnode;

import java.util.List;
import org.apache.calcite.sql.*;
import org.apache.calcite.sql.parser.SqlParserPos;
import org.apache.calcite.sql.validate.SqlValidator;
import org.apache.calcite.sql.validate.SqlValidatorScope;
import org.apache.calcite.util.ImmutableNullableList;
import org.apache.geaflow.dsl.operator.SqlOptionalMatchPatternOperator;

public class SqlOptionalMatchPattern extends SqlCall {

    private SqlNode from;

    private SqlNodeList pathPatterns;

    private SqlNode where;

    private SqlNodeList orderBy;

    private SqlNode limit;

    public SqlOptionalMatchPattern(SqlParserPos pos, SqlNode from, SqlNodeList pathPatterns,
                           SqlNode where, SqlNodeList orderBy, SqlNode limit) {
        super(pos);
        this.from = from;
        this.pathPatterns = pathPatterns;
        this.where = where;
        this.orderBy = orderBy;
        this.limit = limit;
    }

    @Override
    public SqlOperator getOperator() {
        return SqlOptionalMatchPatternOperator.INSTANCE;
    }

    @Override
    public List<SqlNode> getOperandList() {
        return ImmutableNullableList.of(getFrom(), getPathPatterns(), getWhere(),
            getOrderBy(), getLimit());
    }

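    @Override
    public SqlKind getKind() {
        // TODO: return a dedicated SqlKind.GQL_OPTIONAL_MATCH_PATTERN once it is defined.
        // The switch cases in sections 3 to 6 dispatch on that kind and will not
        // match this node while it still reports SqlKind.OTHER.
        return SqlKind.OTHER;
    }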

    @Override
    public void validate(SqlValidator validator, SqlValidatorScope scope) {
        validator.validateQuery(this, scope, validator.getUnknownType());
    }

    public SqlNode getFrom() {
        return from;
    }

    public void setFrom(SqlNode from) {
        this.from = from;
    }

    public SqlNodeList getOrderBy() {
        return orderBy;
    }

    public SqlNode getLimit() {
        return limit;
    }

    public void setOrderBy(SqlNodeList orderBy) {
        this.orderBy = orderBy;
    }

    public void setLimit(SqlNode limit) {
        this.limit = limit;
    }

    @Override
    public void setOperand(int i, SqlNode operand) {
        switch (i) {
            case 0:
                this.from = operand;
                break;
            case 1:
                this.pathPatterns = (SqlNodeList) operand;
                break;
            case 2:
                this.where = operand;
                break;
            case 3:
                this.orderBy = (SqlNodeList) operand;
                break;
            case 4:
                this.limit = operand;
                break;
            default:
                throw new IllegalArgumentException("Illegal index: " + i);
        }
    }

    @Override
    public void unparse(SqlWriter writer, int leftPrec, int rightPrec) {
        writer.keyword("OPTIONAL MATCH");
        if (pathPatterns != null) {
            for (int i = 0; i < pathPatterns.size(); i++) {
                if (i > 0) {
                    writer.print(", ");
                }
                pathPatterns.get(i).unparse(writer, leftPrec, rightPrec);
                writer.newlineAndIndent();
            }
        }
        if (where != null) {
            writer.keyword("WHERE");
            where.unparse(writer, 0, 0);
        }
        if (orderBy != null && orderBy.size() > 0) {
            writer.keyword("ORDER BY");
            for (int i = 0; i < orderBy.size(); i++) {
                SqlNode label = orderBy.get(i);
                if (i > 0) {
                    writer.print(",");
                }
                label.unparse(writer, leftPrec, rightPrec);
            }
            writer.newlineAndIndent();
        }
        if (limit != null) {
            writer.keyword("LIMIT");
            limit.unparse(writer, leftPrec, rightPrec);
        }
    }

    public SqlNodeList getPathPatterns() {
        return pathPatterns;
    }

    public SqlNode getWhere() {
        return where;
    }

    public final boolean isDistinct() {
        return false;
    }

    public void setWhere(SqlNode where) {
        this.where = where;
    }

    public boolean isSinglePattern() {
        return pathPatterns.size() == 1 && pathPatterns.get(0) instanceof SqlPathPattern;
    }
}

3. Validation Layer Changes

3.1 Update GQLValidatorImpl

Modify GQLValidatorImpl.java to handle OPTIONAL MATCH:

// In registerOtherFrom method
switch (node.getKind()) {
    case GQL_RETURN:
    case GQL_FILTER:
    case GQL_MATCH_PATTERN:
    case GQL_OPTIONAL_MATCH_PATTERN:  // Add this case
    case GQL_LET:
        // ... existing code
}

// In registerOtherKindQuery method
switch (node.getKind()) {
    case GQL_RETURN:
        // ... existing code
    case GQL_OPTIONAL_MATCH_PATTERN:  // Add this case
        SqlOptionalMatchPattern optionalMatchPattern = (SqlOptionalMatchPattern) node;
        GQLScope optionalMatchPatternScope = new GQLScope(parentScope, optionalMatchPattern);
        GQLOptionalMatchPatternNamespace optionalMatchNamespace = 
            new GQLOptionalMatchPatternNamespace(this, optionalMatchPattern);
        registerNamespace(usingScope, alias, optionalMatchNamespace, forceNullable);
        
        if (optionalMatchPattern.getWhere() != null) {
            GQLScope whereScope = new GQLScope(optionalMatchPatternScope, optionalMatchPattern.getWhere());
            registerNamespace(whereScope, alias, optionalMatchNamespace, forceNullable);
            scopes.put(optionalMatchPattern.getWhere(), whereScope);
        }
        if (optionalMatchPattern.getOrderBy() != null) {
            GQLReturnOrderByScope orderByScope = new GQLReturnOrderByScope(optionalMatchPatternScope,
                optionalMatchPattern.getOrderBy());
            registerNamespace(orderByScope, alias, optionalMatchNamespace, forceNullable);
            scopes.put(optionalMatchPattern.getOrderBy(), orderByScope);
        }
        break;
    // ... existing cases
}
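
The scope registration above relies on child scopes falling back to their parent, so the WHERE and ORDER BY of an OPTIONAL MATCH can see variables bound by earlier clauses. The sketch below shows that lookup in plain Java. It is an assumption-laden illustration, not the GQLScope API: ScopeChainSketch, Scope, declare and lookup are hypothetical names, and marking optional variables as nullable is my reading of the forceNullable flag.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustration only: variable lookup through a chain of nested scopes. */
final class ScopeChainSketch {

    static final class Scope {
        private final Scope parent;
        private final Map<String, Boolean> nullable = new HashMap<>();

        Scope(Scope parent) {
            this.parent = parent;
        }

        /** Registers a variable and whether it may be null. */
        void declare(String variable, boolean isNullable) {
            nullable.put(variable, isNullable);
        }

        /** Returns the nullability of a visible variable, or null if it is not visible. */
        Boolean lookup(String variable) {
            for (Scope s = this; s != null; s = s.parent) {
                Boolean found = s.nullable.get(variable);
                if (found != null) {
                    return found;
                }
            }
            return null;
        }
    }

    public static void main(String[] args) {
        Scope match = new Scope(null);
        match.declare("p", false);
        match.declare("f", false);

        Scope optionalMatch = new Scope(match);
        optionalMatch.declare("t", true); // bound only if the optional pattern matches

        System.out.println(optionalMatch.lookup("f")); // false: inherited from the MATCH scope
        System.out.println(optionalMatch.lookup("t")); // true: optional variable
        System.out.println(match.lookup("t"));         // null: not visible in the outer scope
    }
}
```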

3.2 Create GQLOptionalMatchPatternNamespace

Create a new namespace class for OPTIONAL MATCH patterns:

public class GQLOptionalMatchPatternNamespace extends GQLMatchPatternNamespace {
    
    public GQLOptionalMatchPatternNamespace(GQLValidatorImpl validator, 
                                          SqlOptionalMatchPattern optionalMatchPattern) {
        super(validator, optionalMatchPattern);
    }
    
    @Override
    public SqlNode getNode() {
        return (SqlOptionalMatchPattern) super.getNode();
    }
}

4. Relational Algebra Layer Changes

4.1 Update GQLToRelConverter

Modify GQLToRelConverter.java to handle OPTIONAL MATCH:

// In convertQueryRecursive method
switch (kind) {
    case GQL_FILTER:
        return convertGQLFilter((SqlFilterStatement) query, top, withBb);
    case GQL_RETURN:
        return convertGQLReturn((SqlReturnStatement) query, top, withBb);
    case GQL_MATCH_PATTERN:
        return convertGQLMatchPattern((SqlMatchPattern) query, top, withBb);
    case GQL_OPTIONAL_MATCH_PATTERN:  // Add this case
        return convertGQLOptionalMatchPattern((SqlOptionalMatchPattern) query, top, withBb);
    case GQL_LET:
        return convertGQLLet((SqlLetStatement) query, top, withBb);
    // ... existing cases
}

// Add new conversion method
private RelNode convertGQLOptionalMatchPattern(SqlOptionalMatchPattern optionalMatchPattern, 
                                              boolean top, Blackboard withBb) {
    // Implementation similar to convertGQLMatchPattern but with outer join semantics
    // for optional pattern matching
    RelNode input = convertFrom(optionalMatchPattern.getFrom(), withBb);
    
    // Convert the optional match pattern to a relational algebra expression
    // using left outer join to handle the optional semantics
    RelNode optionalMatchRel = convertPathPatterns(optionalMatchPattern.getPathPatterns(), 
                                                  optionalMatchPattern.getWhere());
    
    // Apply left outer join between input and optional match
    return LogicalOptionalMatch.create(input, optionalMatchRel);
}

4.2 Create LogicalOptionalMatch Operator

Create a new relational operator for OPTIONAL MATCH:

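import com.google.common.collect.ImmutableSet;
import java.util.Set;
import org.apache.calcite.plan.RelOptCluster;
import org.apache.calcite.plan.RelTraitSet;
import org.apache.calcite.rel.RelNode;
import org.apache.calcite.rel.core.CorrelationId;
import org.apache.calcite.rel.core.Join;
import org.apache.calcite.rel.core.JoinRelType;
import org.apache.calcite.rex.RexNode;

// LogicalJoin is declared final in Calcite, so this sketch extends the abstract Join base class.
// It assumes a Calcite version whose Join constructor takes no hints argument.
public class LogicalOptionalMatch extends Join {
    
    public LogicalOptionalMatch(RelOptCluster cluster, RelTraitSet traits, 
                               RelNode left, RelNode right, RexNode condition,
                               Set<CorrelationId> variablesSet, JoinRelType joinType) {
        super(cluster, traits, left, right, condition, variablesSet, joinType);
    }
    
    public static LogicalOptionalMatch create(RelNode left, RelNode right) {
        // Placeholder condition: a real implementation should join on the variables
        // shared between the input and the optional pattern instead of TRUE.
        RexNode condition = left.getCluster().getRexBuilder().makeLiteral(true);
        return new LogicalOptionalMatch(left.getCluster(), left.getTraitSet(), 
                                       left, right, condition, ImmutableSet.of(), 
                                       JoinRelType.LEFT);
    }
    
    // Join.copy is declared to return Join, so the override must return Join or a subtype.
    @Override
    public LogicalOptionalMatch copy(RelTraitSet traitSet, RexNode conditionExpr, 
                       RelNode left, RelNode right, JoinRelType joinType, 
                       boolean semiJoinDone) {
        return new LogicalOptionalMatch(getCluster(), traitSet, left, right, 
                                       conditionExpr, getVariablesSet(), joinType);
    }
}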

5. Runtime Layer Changes

5.1 Update QueryContext

Modify QueryContext.java to recognize OPTIONAL MATCH:

// In getCommand method
switch (kind) {
    case SELECT:
    case GQL_FILTER:
    case GQL_MATCH_PATTERN:
    case GQL_OPTIONAL_MATCH_PATTERN:  // Add this case
    case GQL_RETURN:
        // ... existing code
}

6. Utility and Helper Classes

6.1 Update GQLNodeUtil

Modify GQLNodeUtil.java to detect OPTIONAL MATCH:

public static boolean containOptionalMatch(SqlNode node) {
    return !collect(node, n -> n.getKind() == SqlKind.GQL_OPTIONAL_MATCH_PATTERN).isEmpty();
}
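
The helper above depends on a collect utility that walks the node tree and gathers every node matching a predicate. A self-contained sketch of that traversal in plain Java, assuming a simplified node type; NodeCollectSketch, Node and Kind are hypothetical stand-ins for SqlNode and SqlKind, not the GeaFlow classes:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

/** Illustration only: predicate-based collection over a small syntax tree. */
final class NodeCollectSketch {

    enum Kind { MATCH, OPTIONAL_MATCH, RETURN, FILTER }

    static final class Node {
        final Kind kind;
        final List<Node> children;

        Node(Kind kind, Node... children) {
            this.kind = kind;
            this.children = Arrays.asList(children);
        }
    }

    /** Depth-first walk that returns every node accepted by the predicate. */
    static List<Node> collect(Node root, Predicate<Node> predicate) {
        List<Node> found = new ArrayList<>();
        if (root == null) {
            return found;
        }
        if (predicate.test(root)) {
            found.add(root);
        }
        for (Node child : root.children) {
            found.addAll(collect(child, predicate));
        }
        return found;
    }

    static boolean containOptionalMatch(Node root) {
        return !collect(root, n -> n.kind == Kind.OPTIONAL_MATCH).isEmpty();
    }

    public static void main(String[] args) {
        Node query = new Node(Kind.RETURN,
            new Node(Kind.OPTIONAL_MATCH, new Node(Kind.MATCH)));
        System.out.println(containOptionalMatch(query));                // true
        System.out.println(containOptionalMatch(new Node(Kind.MATCH))); // false
    }
}
```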

Testing Plan

Unit Tests

  1. Parser Tests:

    • Test OPTIONAL MATCH parsing with various patterns
    • Test OPTIONAL MATCH with WHERE clauses
    • Test OPTIONAL MATCH with ORDER BY and LIMIT
  2. Validation Tests:

    • Test semantic validation of OPTIONAL MATCH patterns
    • Test variable scoping in OPTIONAL MATCH
  3. Relational Algebra Tests:

    • Test conversion to relational algebra
    • Test outer join semantics
  4. Runtime Tests:

    • Test execution with sample graph data
    • Test null value handling

Integration Tests

  1. Complex Query Tests:

    • Test OPTIONAL MATCH combined with regular MATCH
    • Test nested OPTIONAL MATCH patterns
    • Test OPTIONAL MATCH with RETURN and FILTER
  2. Performance Tests:

    • Test query performance with large graphs
    • Test memory usage patterns

kitalkuyo-gita, Oct 27 '25 04:10

Although PGQL is a standard, it is rarely used in production environments, partly because the standard was released relatively late; before that, Gremlin was the common choice. I think this is a good suggestion, but given actual usage, its priority should not be very high.

yaozhongq, Nov 04 '25 09:11