graphstorm
Enterprise graph machine learning framework for billion-scale graphs for ML scientists and data scientists.
Currently, GConstruct offers no way to handle missing values in any of its transformations. We'll need to add that to bring it to parity with GSProcessing.
*Issue #, if available:* *Description of changes:* This PR adds the usage of edge_feat in the GATV2 model for homogeneous graphs in gsf.py and gat_encoder.py. The corresponding unit test is added in...
When edge features are used, the `GSEdgeEncoderInputLayer` is trained, but the `save_model()` function saves only the node encoder layer. At inference time, the `GSEdgeEncoderInputLayer` is restored by using...
Hi, I have trained a link prediction model on a large original graph using the GS CLI (`graphstorm.gconstruct.construct_graph` & `graphstorm.run.gs_link_prediction`). Now I am trying to switch to the GS Python...
In GraphStorm, there are two main methods to use language models (LMs) during graph construction: 1/ tokenizing text attributes into integer token IDs with a HuggingFace tokenizer; 2/ embedding text...
*Issue #, if available:* *Description of changes:* * Add support for creating an HPO step in the SM pipeline. By submitting this pull request, I confirm that you can use, modify,...
*Issue #, if available:* *Description of changes:* * Avoid any shared memory outage because of too many categories. * GSProcessing will use the count order to determine the first...
*Issue #, if available:* https://github.com/awslabs/graphstorm/issues/1242 *Description of changes:* By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your...
Issues we have observed because `distributed_executor.py` contains a lot of code and sometimes includes breaking changes between versions: * The entry point is uploaded and attached during job launch/pipeline creation, from...