graphstorm issues

[Change Request] Modify new `Evaluator` for the three tasks

The current GraphStorm `Evaluator`s have a misleading naming convention and lack a default metric. The new `Evaluator`s will be task-specific. Sample code: ``` evaluator = gs.eval.GSgnnClassificationEvaluator(eval_frequency=100) ``` **Requested changes** 1. Create...

zhjwy9343

break back compatibility

0.3

Evaluation metrics, e.g., accuracy and auc, can NOT be used at the same time

For a classification task, not all metrics can be used at the same time. For example, accuracy and auc_roc require different prediction return values, so they cannot be used in the same training/evaluation epoch....
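To illustrate why the two metrics want different prediction outputs, here is a minimal pure-Python sketch for the binary case. It is not GraphStorm's implementation; the function names `accuracy` and `roc_auc_binary` and the toy data are hypothetical. Accuracy consumes hard class labels (the argmax), while ROC AUC consumes a continuous score for the positive class.

```python
def accuracy(pred_labels, labels):
    """Fraction of hard label predictions that match the ground truth."""
    correct = sum(int(p == y) for p, y in zip(pred_labels, labels))
    return correct / len(labels)


def roc_auc_binary(pos_scores, labels):
    """Binary ROC AUC as the fraction of (positive, negative) pairs
    in which the positive example receives the higher score (ties count 0.5)."""
    pos = [s for s, y in zip(pos_scores, labels) if y == 1]
    neg = [s for s, y in zip(pos_scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy per-class probabilities (hypothetical data), one row per example.
probs = [[0.9, 0.1], [0.4, 0.6], [0.45, 0.55], [0.2, 0.8]]
labels = [0, 1, 0, 1]

# Accuracy needs the argmax class index per example.
pred_labels = [max(range(len(p)), key=p.__getitem__) for p in probs]
# ROC AUC needs the raw score of the positive class instead.
pos_scores = [p[1] for p in probs]

acc = accuracy(pred_labels, labels)
auc = roc_auc_binary(pos_scores, labels)
```

Both inputs are derived from the same probability matrix here, which suggests one possible way to serve both metrics from a single prediction pass: return the probabilities and let each metric derive what it needs.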

zhjwy9343

0.5

Use more efficient KNN methods for link prediction

We should implement a proper link prediction inference script. Users can provide a list of nodes and the link prediction inference script should return top K nodes that are likely...
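As a point of reference for what such a script would compute, the following is a brute-force top-K sketch in pure Python. It assumes, purely for illustration, that link likelihood is scored by the dot product of node embeddings; the names `dot`, `top_k_candidates`, and the toy embeddings are hypothetical and not part of GraphStorm. A more efficient KNN method would replace the exhaustive scan over all candidates.

```python
import heapq


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def top_k_candidates(query_emb, candidate_embs, k):
    """Score every candidate against the query and keep the k best.

    This scans all candidates, so it costs O(N) score evaluations per query.
    """
    scored = ((dot(query_emb, emb), node_id)
              for node_id, emb in candidate_embs.items())
    return [node_id for _, node_id in heapq.nlargest(k, scored)]


# Toy embeddings keyed by node ID (hypothetical data).
candidate_embs = {
    "a": [1.0, 0.0],
    "b": [0.0, 1.0],
    "c": [0.7, 0.7],
    "d": [-1.0, 0.0],
}
top2 = top_k_candidates([1.0, 0.2], candidate_embs, k=2)
```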

zheng-da

Optimize GNN-Bert distillation user experience.

We need to support the following user experience: - [ ] #513 - [ ] Resuming a distillation task from a saved checkpoint - [ ] Provide an end2end experience of...

classicsong

Add end2end test for GSDistilledModel in 0.2.1

```[tasklist] ### Tasks ```

HouyuZhang1007

[GSProcessing] GConstruct input sanity checker

We need a gconstruct config validator in general.

jalencato

gsprocessing

[GSProcessing] Annotation for GSProcessing Code

We currently use list[dict] for many input types. We could upgrade these to list[Mapping] where the input should remain unchanged.
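A small sketch of what that annotation change could look like. The function `column_names` and the `"column"` key are hypothetical and not taken from the GSProcessing code base. `Mapping` exposes only read access, so the signature documents that the function will not modify its inputs, and it also accepts read-only views such as `MappingProxyType`.

```python
from collections.abc import Mapping
from types import MappingProxyType
from typing import Any


def column_names(configs: list[Mapping[str, Any]]) -> list[str]:
    """Read one key from each config without mutating it."""
    return [str(cfg["column"]) for cfg in configs]


# Plain dicts satisfy the Mapping interface.
plain: list[Mapping[str, Any]] = [{"column": "age"}, {"column": "name"}]

# Read-only views over the same dicts are accepted as well.
frozen: list[Mapping[str, Any]] = [MappingProxyType(dict(cfg)) for cfg in plain]

names_plain = column_names(plain)
names_frozen = column_names(frozen)
```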

jalencato

gsprocessing

Implemented hits@k metric for link-prediction

*Issue #, if available:*

*Description of changes:*
- Implemented hits@k metric for link-prediction
- Refactored `GSgnnMrrLPEvaluator` and `GSgnnPerEtypeMrrLPEvaluator` to work with both mrr and hits@k

By submitting this pull request,...
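For readers unfamiliar with the two metrics, here is a minimal pure-Python sketch showing that both can be computed from the same ranks. It is not the code from this pull request; the function names and toy scores are hypothetical, and the tie-breaking rule (a positive edge is outranked only by strictly higher-scoring negatives) is an assumption.

```python
def ranks_from_scores(pos_scores, neg_scores):
    """Rank of each positive edge among its own negatives (1 = best).

    Assumption: only negatives with a strictly higher score outrank it.
    """
    return [1 + sum(1 for n in negs if n > pos)
            for pos, negs in zip(pos_scores, neg_scores)]


def mrr(ranks):
    """Mean reciprocal rank."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at_k(ranks, k):
    """Fraction of positives ranked within the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


# One positive score per edge, and a list of negative scores for each edge
# (hypothetical data).
pos_scores = [0.9, 0.3, 0.5]
neg_scores = [[0.1, 0.2, 0.8], [0.4, 0.6, 0.7], [0.6, 0.1, 0.2]]

ranks = ranks_from_scores(pos_scores, neg_scores)
mrr_value = mrr(ranks)
hits1 = hits_at_k(ranks, 1)
hits3 = hits_at_k(ranks, 3)
```

Because both metrics depend only on the ranks, one evaluator can compute the ranking once and report either metric.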

wangz10

ready

Reduce the memory overhead for text tokens in graph construction pipeline

1

We should use a field to store the valid length of a token list instead of using an attention mask.
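The idea can be sketched as follows, assuming right-padded token lists (padding only at the end); the function names are hypothetical. Under that assumption, a single integer per example carries the same information as a full-length mask, and the mask can be rebuilt on demand.

```python
def mask_from_valid_len(valid_len, max_len):
    """Rebuild a right-padded attention mask from the stored valid length."""
    return [1] * valid_len + [0] * (max_len - valid_len)


def valid_len_from_mask(mask):
    """Collapse a right-padded attention mask into one integer."""
    return sum(mask)


# A mask of max_len entries per example shrinks to a single stored integer.
original_mask = [1, 1, 1, 0, 0]
stored_len = valid_len_from_mask(original_mask)
rebuilt_mask = mask_from_valid_len(stored_len, len(original_mask))
```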

zheng-da

Balance the training/validation/test set in graph partitioning

1

Currently, gconstruct doesn't enforce balance among the training/validation/test sets in graph partitioning for node prediction tasks. If training/validation/test nodes are not evenly split across graph partitions, the node split algorithm...

zheng-da
