tugraph-analytics About the Future of Geaflow

Hi, everyone. I've put together a to-do list.

TODO/FIXME Cleanup and Implementation

I found a large number of unresolved TODO comments, which are good opportunities for improvement:

PipelineUtil.java: Asynchronous mode check logic needs refactoring (issue-607)
OneDegreeGraphScanIterator.java: Graph proxy partition scan iterator needs to be implemented (issue-611)
PartitionType.java: DT partitioning and label DT partitioning need to be supported
StaticGraphPaimonStoreBase.java: Tables need to be created using graph mode instead of KV tables
AbstractUnAlignedWorker.java: LoadGraphProcessEvent needs to be aligned (issue-609)
UnAlignedComputeWorker.java: Dynamic/streaming scenario handling needs to be improved (issue-660)

Hard-coded Values and Magic Numbers

I found many hard-coded strings and numbers:

ClusterConstants.java: Various prefixes and constants can be extracted as configuration
NettyMessage.java: The magic number 0xBADC0FFE should be extracted into a named constant.
StringLiteralUtil.java: The multiplier array {1000, 100, 10, 1} should be extracted into a named constant.
LocalClient.java: The JSON template string needs to be extracted.
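
As a rough illustration of the kind of cleanup meant here, the sketch below moves the two literals into named constants. The class name `MessageConstants`, the field names, and the helper methods are hypothetical and not taken from the GeaFlow code base; only the literal values come from the list above.

```java
/**
 * Hypothetical holder for shared constants. The class, field, and method
 * names are illustrative assumptions, not existing GeaFlow identifiers.
 */
final class MessageConstants {

    /** Marker value that currently appears inline as 0xBADC0FFE. */
    static final int MAGIC_NUMBER = 0xBADC0FFE;

    /** Multipliers that currently appear inline as {1000, 100, 10, 1}. */
    private static final int[] MULTIPLIERS = {1000, 100, 10, 1};

    private MessageConstants() {
        // Constants holder, not meant to be instantiated.
    }

    /** Returns true when the given header value equals the marker. */
    static boolean hasValidMagic(int header) {
        return header == MAGIC_NUMBER;
    }

    /** Returns a copy so callers cannot modify the shared array. */
    static int[] multipliers() {
        return MULTIPLIERS.clone();
    }
}
```

Keeping the array private and handing out copies is one possible design choice; a static final array can still have its elements changed, so exposing it directly would not make it a true constant.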

Exception Handling Improvements

I identified several areas where exception handling can be improved:

ErrorApiResponse.java: Exception classification logic can be more refined.
ComponentUncaughtExceptionHandler.java: Exception handling can be more elegant.
SliceOutputChannelHandler.java: Exception handling can be more specific.
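
To make "more refined classification" a bit more concrete, here is a minimal sketch of mapping exception types to error categories. It does not reflect the current ErrorApiResponse.java; the class `ErrorClassifierSketch`, the `Category` enum, and the chosen exception types are assumptions made only for illustration.

```java
import java.util.concurrent.TimeoutException;

/**
 * Hypothetical classifier. The categories and the exception types they map
 * from are assumptions for illustration, not the existing GeaFlow logic.
 */
final class ErrorClassifierSketch {

    enum Category { INVALID_REQUEST, TIMEOUT, INTERNAL }

    private ErrorClassifierSketch() {
        // Static helper, not meant to be instantiated.
    }

    /** Maps a throwable to a category, checking the specific types first. */
    static Category classify(Throwable error) {
        if (error instanceof IllegalArgumentException) {
            return Category.INVALID_REQUEST;
        }
        if (error instanceof TimeoutException) {
            return Category.TIMEOUT;
        }
        // Anything that is not recognized falls back to a generic category.
        return Category.INTERNAL;
    }
}
```

Subclasses are matched as well, so a `NumberFormatException` would be reported as `INVALID_REQUEST` in this sketch.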

Code Duplication

I identified several duplicate code patterns:

ListUtil.java: Duplicate logic in collection operations.
FunctionCallUtils.java: Duplicate code in type mapping.
QueryTester.java: Duplicate file reading logic.

In addition, I've noticed some issues with the similarity calculations, such as Jaccard similarity and node similarity.

Overall, the community's future plans fall into two main areas:

1. Strengthening and maintaining basic capabilities
2. Expanding GNN sampling and AI+graph capabilities

Does anyone have additional comments? Feel free to discuss!
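
For reference on the similarity item, Jaccard similarity of two sets is the size of their intersection divided by the size of their union. Below is a minimal standalone sketch over plain Java sets, for example two vertices' neighbor sets. It is not the GeaFlow implementation, and the class name `JaccardSketch` is hypothetical.

```java
import java.util.Set;

/** Hypothetical helper; not part of the GeaFlow code base. */
final class JaccardSketch {

    private JaccardSketch() {
        // Static helper, not meant to be instantiated.
    }

    /**
     * Returns |a ∩ b| / |a ∪ b|.
     * Two empty sets yield 0.0 here; this is a convention chosen for the
     * sketch, and some definitions use 1.0 instead.
     */
    static <T> double jaccard(Set<T> a, Set<T> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 0.0;
        }
        // Iterate over the smaller set to reduce the number of lookups.
        Set<T> smaller = a.size() <= b.size() ? a : b;
        Set<T> larger = smaller == a ? b : a;
        int intersection = 0;
        for (T item : smaller) {
            if (larger.contains(item)) {
                intersection++;
            }
        }
        // Inclusion-exclusion gives the union size without building a new set.
        int union = a.size() + b.size() - intersection;
        return (double) intersection / union;
    }
}
```

For the neighbor sets {1, 2, 3} and {2, 3, 4}, the intersection has 2 elements and the union has 4, so the result is 0.5. How the empty-set case should behave is a decision the real implementation would need to document.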

Oct 13 '25 03:10 kaori-seasons

Hi @kaori-seasons, thank you for putting together such a detailed technical to-do list! These TODOs, hardcoded values, and exception handling issues definitely deserve priority cleanup, especially those related to core modules like graph storage, partitioning strategies, and streaming processing.

I’d like to suggest the following:

Break these items down into concrete sub-tasks to make it easier for community members to pick up and contribute.
For new capability areas like "similarity computation" and "GNN sampling," could we add a brief design sketch to guide implementation?

We’d also love to see you get more deeply involved in building our community!

Nov 03 '25 03:11 Loognqiang

Hello, I was away on a business trip last weekend, and I'm glad to see your message. I agree that your suggestions are necessary. I will organize the issues and add the relevant design drafts soon to attract more contributors.

Nov 10 '25 05:11 kaori-seasons