
[Graphbolt] Offline script to convert from COO to CSC sampling graph.

Open frozenbugs opened this issue 1 year ago • 3 comments

🔨Work Item

So that the converted graph can be loaded directly as a CSC sampling graph.

IMPORTANT:

  • This template is only for dev team to track project progress. For feature request or bug report, please use the corresponding issue templates.
  • DO NOT create a new work item if the purpose is to fix an existing issue or feature request. We will directly use the issue in the project tracker.

Project tracker: https://github.com/orgs/dmlc/projects/2

Description

Depending work items or issues

frozenbugs, Jun 08 '23 04:06

I noticed that there is already a CPU implementation of this conversion in the DGL code.

Do we need to follow that code? My understanding is that if the only requirement is to convert a single COO graph to CSC, it might be sufficient.
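For reference, a minimal sketch of what converting a single unsorted COO edge list to CSC involves. This is a plain-Python counting sort written for illustration, not the DGL CPU implementation mentioned above; the name `coo_to_csc` and its signature are assumptions.

```python
from typing import List, Sequence, Tuple


def coo_to_csc(
    src: Sequence[int], dst: Sequence[int], num_nodes: int
) -> Tuple[List[int], List[int]]:
    """Convert unsorted COO edges (src[i] -> dst[i]) to CSC (indptr, indices).

    Hypothetical helper for illustration; not part of DGL.
    """
    if len(src) != len(dst):
        raise ValueError("src and dst must have the same length")

    # Count incoming edges per destination node (one CSC column per node).
    indptr = [0] * (num_nodes + 1)
    for d in dst:
        indptr[d + 1] += 1

    # A prefix sum turns per-column counts into column start offsets.
    for v in range(num_nodes):
        indptr[v + 1] += indptr[v]

    # Scatter each source ID into the next free slot of its column.
    next_slot = indptr[:-1].copy()
    indices = [0] * len(src)
    for s, d in zip(src, dst):
        indices[next_slot[d]] = s
        next_slot[d] += 1

    # Sort the neighbors inside each column so the output is deterministic.
    for v in range(num_nodes):
        lo, hi = indptr[v], indptr[v + 1]
        indices[lo:hi] = sorted(indices[lo:hi])

    return indptr, indices


# Edges 0->1, 2->1, 1->0, 0->2, deliberately given out of order.
indptr, indices = coo_to_csc([0, 2, 1, 0], [1, 1, 0, 2], num_nodes=3)
print(indptr)   # [0, 1, 3, 4]
print(indices)  # [1, 0, 2, 0]
```

The in-neighbors of node `v` are `indices[indptr[v]:indptr[v + 1]]`, which is the access pattern a sampler needs.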

keli-wen, Jun 29 '23 07:06

Could we offer a utility API that converts COO data (which could be homogeneous or heterogeneous, and unsorted) into a sorted CSCSamplingGraph? Could we first construct a DGLGraph from the COO data and then convert it via dgl.to_homogeneous() to obtain the CSC matrix?
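To make the question concrete, here is a rough sketch of what such a utility would have to do for heterogeneous input: shift per-type node IDs into one homogeneous ID space, sort the edges by destination, and build the CSC arrays. It uses plain Python rather than DGLGraph or dgl.to_homogeneous(). The function name, the sorted-by-name ordering of node and edge types, and the returned layout are all assumptions made for illustration.

```python
from itertools import accumulate
from typing import Dict, List, Tuple

# (source node type, relation name, destination node type)
EdgeType = Tuple[str, str, str]


def hetero_coo_to_sorted_csc(
    num_nodes: Dict[str, int],
    edges: Dict[EdgeType, Tuple[List[int], List[int]]],
):
    """Homogenize typed COO edges and return a CSC layout sorted by destination.

    Hypothetical helper for illustration; not a DGL API.
    """
    # Assumption: node types are laid out contiguously in sorted-name order.
    ntypes = sorted(num_nodes)
    starts = [0] + list(accumulate(num_nodes[t] for t in ntypes))
    offset = dict(zip(ntypes, starts))
    # Assumption: edge type IDs follow the sorted order of the type triples.
    etype_id = {et: i for i, et in enumerate(sorted(edges))}

    # Map every typed edge to (homogeneous dst, homogeneous src, edge type ID).
    triples = []
    for et, (src, dst) in edges.items():
        s_type, _, d_type = et
        for s, d in zip(src, dst):
            triples.append((offset[d_type] + d, offset[s_type] + s, etype_id[et]))
    triples.sort()  # by destination, then source, then edge type

    # Per-destination counts followed by a prefix sum give the column pointers.
    counts = [0] * (starts[-1] + 1)
    for d, _, _ in triples:
        counts[d + 1] += 1
    indptr = list(accumulate(counts))

    indices = [s for _, s, _ in triples]
    type_per_edge = [t for _, _, t in triples]
    return indptr, indices, starts, type_per_edge


num_nodes = {"user": 2, "item": 3}  # item IDs become 0..2, user IDs 3..4
edges = {
    ("user", "buys", "item"): ([1, 0, 0], [2, 0, 2]),
    ("user", "follows", "user"): ([0], [1]),
}
indptr, indices, node_type_offset, type_per_edge = hetero_coo_to_sorted_csc(
    num_nodes, edges
)
print(indptr)            # [0, 1, 1, 3, 3, 4]
print(indices)           # [3, 3, 4, 3]
print(node_type_offset)  # [0, 3, 5]
print(type_per_edge)     # [0, 0, 0, 1]
```

Keeping the node type offsets and a per-edge type ID alongside the CSC arrays is what lets the original types be recovered after homogenization.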

Rhett-Ying, Jul 03 '23 05:07

The PR "[Graphbolt] Add the preprocess_ondisk_dataset function." implements a preprocess_ondisk_dataset() function, which:

  • Receives the input_config_path (a YAML file), extracts the data from the graph field, and uses the original data in .csv/.npy format to create a CSCSamplingGraph.
  • Converts all torch-format data into numpy format.
  • Returns the path of the processed YAML file.
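The overall flow (config in, converted graph and processed config out) can be sketched with the standard library alone. This is not the code from the PR: JSON stands in for YAML, the CSC arrays are saved as JSON instead of a CSCSamplingGraph, the torch-to-numpy step is omitted, and the config schema (`num_nodes`, `edges`, `type`, `path`) and the name `preprocess_sketch` are hypothetical.

```python
import csv
import json
import tempfile
from pathlib import Path


def preprocess_sketch(input_config_path: str) -> str:
    """Hypothetical stand-in for the flow: read config, convert, return new path."""
    config_path = Path(input_config_path)
    base = config_path.parent
    config = json.loads(config_path.read_text())

    # Hypothetical schema: {"graph": {"num_nodes": int, "edges": "<csv path>"}}
    graph = config["graph"]
    with open(base / graph["edges"], newline="") as f:
        edges = [(int(s), int(d)) for s, d in csv.reader(f)]

    # Sorting by (dst, src) groups the edges into CSC columns.
    edges.sort(key=lambda e: (e[1], e[0]))
    indptr = [0] * (graph["num_nodes"] + 1)
    for _, d in edges:
        indptr[d + 1] += 1
    for v in range(graph["num_nodes"]):
        indptr[v + 1] += indptr[v]
    indices = [s for s, _ in edges]

    out_dir = base / "preprocessed"
    out_dir.mkdir(exist_ok=True)
    (out_dir / "csc_graph.json").write_text(
        json.dumps({"indptr": indptr, "indices": indices})
    )

    # The processed config points at the converted graph, not the raw CSV.
    config["graph"] = {"type": "csc", "path": "preprocessed/csc_graph.json"}
    output_config_path = out_dir / "config.json"
    output_config_path.write_text(json.dumps(config, indent=2))
    return str(output_config_path)


# Usage with a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "edges.csv").write_text("0,1\n2,1\n1,0\n0,2\n")
    (root / "config.json").write_text(
        json.dumps({"graph": {"num_nodes": 3, "edges": "edges.csv"}})
    )
    processed_path = preprocess_sketch(str(root / "config.json"))
    processed = json.loads(Path(processed_path).read_text())
    csc = json.loads((root / processed["graph"]["path"]).read_text())
    print(csc)  # {'indptr': [0, 1, 3, 4], 'indices': [1, 0, 2, 0]}
```

Returning the path of a new config, rather than editing the input in place, keeps the raw dataset untouched so the conversion can be rerun.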

keli-wen, Jul 19 '23 09:07