pytorch_geometric
Added an example of vthost's `DAGNN` for the `ogbg-code2` dataset
I've made a clean, minimal re-implementation of vthost's DAGNN architecture. In a single run, the best results I obtained came after epoch 50/50 (I should have run for more epochs), with the following F1 scores:
Train: 0.3848, Val: 0.1646, Test: 0.1814
These results align with the ogbg-code2 leaderboard, and also with the authors' claim that the model converges after about 45 epochs. With the settings I provided, one epoch takes roughly 6.5 minutes on a single A100 GPU and an AMD EPYC 7413 24-core processor.
Appreciate the work @ArchieGertsman. The `DAGNN` module you've added is a new GNN module. Can we add it to `torch_geometric/contrib/nn`, so that it can be more widely used?
Sounds good to me! What needs to be changed before adding `DAGNN` to `contrib/nn`? I assume the code related to `ogbg-code2` should be removed to make the module more general. I'm not sure how best to handle the DAG-layer edge masking, i.e. where to create the masks and where to collate them for batching. Currently, the masks are created via `Dataset.transform` and are collated using a custom `collate_fn` provided to a `torch.DataLoader`.
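For concreteness, here is roughly how that wiring can look (a minimal sketch, not the exact code from the PR; the `node_depth` attribute and the `add_layer_edge_masks` / `layer_masks` names are illustrative assumptions):

```python
import torch
from torch.utils.data import DataLoader
from torch_geometric.data import Batch, Data

# Hypothetical transform: for each graph, precompute one boolean edge mask
# per topological layer, selecting the edges whose target node sits in that
# layer. `data.node_depth` (the topological layer of each node) is assumed.
def add_layer_edge_masks(data):
    depth = data.node_depth.view(-1)      # topological layer of every node
    num_layers = int(depth.max()) + 1
    tgt = data.edge_index[1]              # target node of every edge
    data.layer_masks = [depth[tgt] == l for l in range(num_layers)]
    return data

# Custom collate_fn: let PyG batch the graphs as usual, then pad each graph's
# mask list to a common number of layers and concatenate along the edge
# dimension, so each layer mask lines up with the batched edge_index.
def collate_with_masks(data_list):
    batch = Batch.from_data_list(data_list, exclude_keys=['layer_masks'])
    max_layers = max(len(d.layer_masks) for d in data_list)
    batch.layer_masks = [
        torch.cat([
            d.layer_masks[l] if l < len(d.layer_masks)
            else torch.zeros(d.num_edges, dtype=torch.bool)
            for d in data_list
        ])
        for l in range(max_layers)
    ]
    return batch

# Toy stand-in for the real dataset: a few small DAGs with node depths.
dataset = [
    add_layer_edge_masks(Data(
        x=torch.randn(3, 8),
        edge_index=torch.tensor([[0, 0], [1, 2]]),
        node_depth=torch.tensor([0, 1, 1]),
    ))
    for _ in range(4)
]
loader = DataLoader(dataset, batch_size=2, collate_fn=collate_with_masks)
batch = next(iter(loader))  # batch.layer_masks[l] masks the batched edges
```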
Thank you all for working on this, this is awesome! I am, however, wondering if we should leave DAGNN as an example and instead use our newest proposal for the model version. It basically does the same thing but solves the efficiency problem. We still have to update the arXiv version, but by now we also have experiments with NodeFormer and GraphGPS. It's in transformer style, but can be implemented using message passing. I am happy with any solution you propose, just wanted to bring it up! https://github.com/LUOyk1999/DAGformer https://arxiv.org/pdf/2210.13148.pdf
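For anyone curious what "transformer style, but can be implemented using message passing" can look like in PyG terms, here is a toy sketch (an illustration under stated assumptions, not the DAGformer reference implementation): attention is restricted to DAG-reachable node pairs by materializing the reachability closure densely and running a standard `TransformerConv` over it. A real implementation would build the closure sparsely.

```python
import torch
from torch_geometric.nn import TransformerConv
from torch_geometric.utils import to_dense_adj

# Toy helper (assumption, suitable for small graphs only): compute the DAG's
# transitive closure by repeatedly extending reachability one hop at a time.
def reachability_edge_index(edge_index, num_nodes):
    adj = to_dense_adj(edge_index, max_num_nodes=num_nodes)[0].bool()
    reach = adj.clone()
    for _ in range(num_nodes):
        new = reach | (reach.float() @ adj.float()).bool()
        if torch.equal(new, reach):
            break
        reach = new
    return reach.nonzero().t().contiguous()  # [2, num_reachable_pairs]

x = torch.randn(5, 16)                                   # toy node features
edge_index = torch.tensor([[0, 0, 1, 2], [1, 2, 3, 4]])  # a small DAG
reach_ei = reachability_edge_index(edge_index, num_nodes=5)

# Attention only flows along reachable (ancestor -> descendant) pairs.
conv = TransformerConv(in_channels=16, out_channels=16, heads=2, concat=False)
out = conv(x, reach_ei)
```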
edit: For completeness, I also want to point out PACE, a transformer-based model with a similar goal from @zehao-dong and @muhanzhang. We had completely missed it until now.
Hi Veronika, thanks again for your great work. I'm finding that DAGNN is easier to train than the DAGformer in my domain (e.g. finding the right hyperparameters), at the expense of training & inference time. Because of this tradeoff, I believe that DAGNN is still a valuable architecture. I agree that it would be good to see DAGformer in PyG as well.
This is interesting, thank you for the feedback!
> What needs to be changed before adding `DAGNN` to `contrib/nn`?
I'm not part of the team; just interested in this model. I think not much would need to change. Examples of other similar merged PRs:
- https://github.com/pyg-team/pytorch_geometric/pull/8287
- https://github.com/pyg-team/pytorch_geometric/pull/7370