tugraph-analytics GeaFlow Technical Roadmap

This roadmap outlines our planned features and improvements for GeaFlow. While we will strive to deliver these features according to the following schedule, the priorities may change based on community feedback and internal demands.

Short-term (2025.6-2025.12)
- Streaming Graph Capability Enhancement
  - Support for filter pushdown
  - Custom serialization
  - Custom Partitioning Strategy
  - Streaming integration of batch operators such as Hive tables
  - Support for msgCombine
  - Common stream and batch operators
  - Asynchronous coroutine scheduling
  - Multi-level FO & HA
- Store
  - Native CStore Integration
  - Columnar Storage Support
  - Time Travel
  - Vertex and edge support for label + time-based partitioning
- High-Performance Graph Analysis
  - Support for GQL standards (long-term)
- AI Ecosystem Development
  - Graph + vector indexing
  - Support for 1-2 GNN algorithms
- Paimon Ecosystem Development
  - Support for Paimon batch source connector
  - Support for Paimon stream source connector
  - Platform support for Paimon table management and operations
  - Support for distributed graph read/write
  - Support for sorted detail table read/write
  - Support for dynamic graphs
Long-term (Next Year)
- GQL Standard Support
- Integrated Graph Lakehouse Capability
- Streaming Graph Capability Enhancement
  - Columnar storage integration
  - Parallel execution of multiple queries
- High-Performance Graph Analysis
  - Vectorized Computing
  - CBO Optimizer
  - CodeGen
  - Dynamic Schema

Although we have our own technical roadmap, we sincerely welcome community developers to build GeaFlow together with us. Examples include supporting the ISO/GQL standard, building the AI ecosystem, extending the graph lakehouse to ecosystems such as Paimon, Hudi, and Iceberg, and integrating with shuffle managers such as Celeborn.

May 28 '25 12:05 Loognqiang

Thank you for the clear roadmap, but it still looks like a pretty rough plan to me, because:

It doesn't include specific plans for people and resources;
It doesn't assign these features to specific versions.

I think you should:

Break up larger functions into smaller discussions.
Give clear due dates for completing the plan, and assign each function to a particular person who will be responsible for carrying it out.

Jun 14 '25 15:06 mingcheng

Hi @mingcheng, thank you very much for your valuable feedback! Your observations are spot-on: the current roadmap does lack sufficient detail on execution.

We will take the following steps next:

Refine version planning: Clearly assign features to specific releases such as v0.8, v0.9, etc.
Define clear timelines: Attract suitable developers (committers or contributors) for each key task and set reasonable delivery deadlines.
Break down large features: For example, decompose "Graph Lakehouse capabilities" into subtasks such as read/write support for static and dynamic graphs in the lakehouse, index-based query optimization, etc., and initiate separate design discussions for each subtask.

We warmly welcome everyone to claim tasks they are interested in via follow-up emails or GitHub issues. Let's work together to turn this plan into reality! We would also greatly appreciate any further suggestions for improving the roadmap.

Nov 03 '25 03:11 Loognqiang

@Loognqiang @mingcheng Thank you both for your hard work. I look forward to making more contributions to Apache GeaFlow.

Nov 03 '25 05:11 kitalkuyo-gita

Hi @kitalkuyo-gita,

Thank you for your recent contributions to the Apache GeaFlow community. We also look forward to continuing to build the community together.

Nov 03 '25 06:11 Loognqiang