
About the accuracy gap between this repo and the paper

lygztq opened this issue 4 years ago · 15 comments

Hi, your work is excellent. However, I found a gap between the results obtained with your code and those reported in your paper. Specifically:

        Mutagenicity   NCI109         NCI1           DD
Code    79.68 (1.68)   73.86 (1.72)   76.29 (2.14)   75.46 (3.86)
Paper   82.15 (0.58)   80.67 (1.16)   78.45 (0.77)   80.96 (1.26)

I followed your hyper-parameter settings and did not change any part of your code. Why does this happen? I'm using Python 3.7 with the latest versions of pytorch and pytorch_geometric. Is it possible that a mismatched version of PyG causes such a gap?
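In case it helps to compare environments, here is a quick way to print the relevant versions (a sketch; all three packages expose __version__):

import torch
import torch_geometric
import torch_sparse

# Print the versions of the packages most likely to cause the gap.
print(torch.__version__)
print(torch_geometric.__version__)
print(torch_sparse.__version__)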

lygztq avatar Jan 04 '21 02:01 lygztq

Hi,

This is very strange. Let me check it.

cszhangzhen avatar Jan 06 '21 02:01 cszhangzhen

Same for the PROTEINS dataset: I am only able to achieve 72% accuracy (same code, same version, same config) compared to 80%+ in the paper. Were any other hyperparameters tuned to reach 80% accuracy?

ScottLiao920 avatar Jan 26 '21 03:01 ScottLiao920

Hi,

Sorry for the late reply.

I found that this is caused by an update to torch-sparse. Since the release of torch-sparse 0.4.4 (reference here), spspmm no longer supports autograd. However, our model needs to learn the sparse adjacency matrix. I suggest using an older version of torch-sparse.
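A quick way to check whether an installed torch-sparse is affected (a sketch; the 2x2 toy matrix and variable names are just for illustration):

import torch
from torch_sparse import spspmm

# Multiply a tiny 2x2 sparse matrix with itself and see whether the
# result still tracks gradients back to `value`.
index = torch.tensor([[0, 0, 1], [0, 1, 1]])
value = torch.ones(3, requires_grad=True)
out_index, out_value = spspmm(index, value, index, value, 2, 2, 2)
print(out_value.requires_grad)  # False on torch-sparse >= 0.4.4

If it prints False, pinning an older release (anything before 0.4.4, e.g. pip install torch-sparse==0.4.3, subject to compatibility with your torch build) should restore autograd.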

cszhangzhen avatar Mar 24 '21 02:03 cszhangzhen

Actually, I also tried your 'full-graph' version (which has no spspmm operation) and got similar results...

                            Mutagenicity   NCI109         NCI1           DD
Reported in Paper           82.15 (0.58)   80.67 (1.16)   78.45 (0.77)   80.96 (1.26)
Author's Code (full graph)  78.44 (2.10)   74.44 (2.05)   77.37 (2.09)   OOM
Author's Code (sample)      79.68 (1.68)   73.86 (1.72)   76.29 (2.14)   75.46 (3.86)

Also, it is worth pointing out that the three options --sample_neighbor, --sparse_attention and --structure_learning have no effect when set from the command line, since argparse's type=bool converts every non-empty string (including 'False') to True.

lygztq avatar Mar 24 '21 03:03 lygztq

The three options --sample_neighbor, --sparse_attention and --structure_learning are defined with a bool type, so they should be initialized to True or False.

I also found that performance drops when updating pytorch_geometric to the latest version, and I'm not sure whether this is caused by spspmm alone. I will check this in detail.

cszhangzhen avatar Mar 24 '21 06:03 cszhangzhen

Thanks for the reply :)

For bool arguments in Python's argparse, you can find an explanation here.
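In short, argparse applies type=bool to the raw command-line string, and bool() of any non-empty string is True:

import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--sample_neighbor', type=bool, default=True)

# bool('False') is True, so the flag cannot actually be turned off this way.
args = parser.parse_args(['--sample_neighbor', 'False'])
print(args.sample_neighbor)  # True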

lygztq avatar Mar 24 '21 13:03 lygztq

Hi,

I have tested the code.

python main.py --sample_neighbor False

Then the following line is inserted into main.py:

print(args.sample_neighbor)

It outputs False.

cszhangzhen avatar Mar 24 '21 15:03 cszhangzhen

Is it possible to reproduce the results on the PROTEINS dataset with the latest torch_sparse version?

Anwar-Said avatar May 05 '21 06:05 Anwar-Said

Hi, I'm afraid not, since spspmm does not support autograd.

cszhangzhen avatar May 06 '21 02:05 cszhangzhen

ok, thanks for your response.

Anwar-Said avatar May 06 '21 03:05 Anwar-Said

Hi, I would be glad if someone could share the best parameter settings for reproducing the results on the graph classification datasets.

Anwar-Said avatar May 08 '21 07:05 Anwar-Said

@cszhangzhen @lygztq @Anwar-Said It may now be possible for gradients to propagate through the TwoHopNeighborhood call, using something like this:


import torch
from torch_sparse import coalesce


class TwoHopNeighborhood(object):
    def __call__(self, data):
        edge_index, edge_attr = data.edge_index, data.edge_attr
        n = data.num_nodes

        # Mark every original edge with a huge value, then square the sparse
        # adjacency with torch.sparse.mm so gradients can flow to `value`.
        fill = 1e16
        value = edge_index.new_full((edge_index.size(1),), fill, dtype=torch.float, requires_grad=True)
        index = torch.sparse_coo_tensor(edge_index, value, (n, n)).coalesce()
        c = torch.sparse.mm(index, index).coalesce()
        row, col = c.indices()[0], c.indices()[1]
        index = torch.stack([row, col], dim=0)
        value = c.values()

        # Append the two-hop edges to the original ones and merge duplicates.
        edge_index = torch.cat([edge_index, index], dim=1)
        if edge_attr is None:
            data.edge_index, _ = coalesce(edge_index, None, n, n)
        else:
            # Broadcast the two-hop values to the shape of edge_attr; with
            # op='min', duplicate edges keep their original (small) attribute,
            # while purely two-hop edges keep the huge value, zeroed below.
            value = value.view(-1, *[1 for _ in range(edge_attr.dim() - 1)])
            value = value.expand(-1, *list(edge_attr.size())[1:])
            edge_attr = torch.cat([edge_attr, value], dim=0)

            data.edge_index, edge_attr = coalesce(edge_index, edge_attr, n, n, op='min')
            edge_attr[edge_attr >= fill] = 0
            data.edge_attr = edge_attr

        return data

    def __repr__(self):
        return '{}()'.format(self.__class__.__name__)

I'm still not seeing exactly the results quoted in the paper, but I get values close to them, provided I use the hyperparameters given in the default main.py.
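For anyone who wants a quick sanity check, here is a minimal smoke test on a toy graph (hypothetical; it just confirms the transform runs and adds the two-hop edges):

import torch
from torch_geometric.data import Data

# Toy path graph 0 - 1 - 2 with scalar edge attributes.
edge_index = torch.tensor([[0, 1, 1, 2], [1, 0, 2, 1]])
edge_attr = torch.ones(4)
data = Data(edge_index=edge_index, edge_attr=edge_attr, num_nodes=3)

data = TwoHopNeighborhood()(data)
print(data.edge_index)  # now also contains two-hop pairs such as (0, 2)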

GageDeZoort avatar Dec 12 '22 18:12 GageDeZoort

@GageDeZoort Thanks for your code. I will try it.

cszhangzhen avatar Dec 15 '22 11:12 cszhangzhen

Awesome, let me know if it helps - I'm very interested in training the structure learning component and reproducing your excellent results!

GageDeZoort avatar Dec 16 '22 14:12 GageDeZoort

@GageDeZoort Thanks so much for posting! Your fix seems to be working for me. However, training is extremely slow relative to a simple GCN, even with a similar number of parameters. Is this expected? @cszhangzhen

mperozek11 avatar Jan 13 '23 02:01 mperozek11