
Using node features (featureless=False in first conv-layer)

Open · JulianNeuberger opened this issue · 0 comments

Hi @tkipf,

first of all, thank you for your work on GCNs. I'm currently researching their application in my domain and really like the results so far.

Sadly, I'm stuck with a problem I'm not sure how to solve. I'm trying to apply the R-GCN in the following setup: there are multiple, relatively small graphs of variable size, which contain directed edges and nodes with an optional feature vector (*). Those graphs have to be classified into two separate classes, which I do by introducing a global node -- the "hacky" solution you proposed in issue #4 on the original gcn code. If I use your code in "featureless" mode, everything works pretty well and I get about 80% accuracy. I suspect that I could improve on that by using the node features mentioned above.
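For reference, this is roughly how I add the global node; it's only a minimal sketch with made-up names (`adj`, `feats`, `add_global_node`), not the exact code I use:

```python
# Hypothetical sketch of the "global node" trick for graph classification,
# assuming a dense adjacency matrix `adj` (n x n) and a node feature matrix
# `feats` (n x d). Names are illustrative, not taken from the rgcn code base.
import numpy as np

def add_global_node(adj, feats):
    """Append one extra node that is connected to every other node.

    The graph-level label is then predicted from this node's output row.
    """
    n = adj.shape[0]
    adj_ext = np.zeros((n + 1, n + 1), dtype=adj.dtype)
    adj_ext[:n, :n] = adj
    adj_ext[n, :n] = 1.0   # global node -> all other nodes
    adj_ext[:n, n] = 1.0   # all other nodes -> global node
    # Give the global node an all-zero feature vector (one choice of many).
    feats_ext = np.vstack([feats, np.zeros((1, feats.shape[1]), dtype=feats.dtype)])
    return adj_ext, feats_ext
```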

As soon as I change the featureless flag in the first graph-convolution layer, the net no longer learns a useful estimator; instead it produces a constant output regardless of the input (the ratio between the target classes, to be precise).

I did some digging to figure out where I went wrong and saw that you use the square identity matrix as dummy features in the featureless case. I then set featureless=False and passed in the square identity matrix, which resulted in roughly the same ~80% accuracy on the global node. But if I replace the identity matrix with a matrix of the same dimensions that has ones in the first column instead of on the diagonal, training fails again. Did I miss an assumption you made in your paper? Are feature vectors simply not allowed, so that I have to use scalars along the diagonal of X?
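To make the experiment concrete, this is roughly what the two feature matrices look like (a minimal sketch with a made-up n, not my actual code):

```python
# `identity_feats` is what the featureless path effectively uses; `ones_first_col`
# is the variant that made training fail for me.
import numpy as np
import scipy.sparse as sp

n = 10
identity_feats = sp.identity(n, format="csr")  # one-hot feature per node
ones_first_col = np.zeros((n, n))
ones_first_col[:, 0] = 1.0                     # every node gets the same feature

# With identity features, X @ W selects a distinct row of the first weight
# matrix per node. With a constant first column, X @ W repeats the same row
# for every node, so the first layer cannot distinguish nodes by features.
```

My (possibly wrong) reading of this is that the identity input effectively turns the first weight matrix into a per-node embedding table, which a constant feature column cannot do.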

I realize that you get a lot of questions regarding your code and work, but I'd be very grateful for any hints or ideas.

Cheers, Julian

(*) Those optional features are latent vectors from an embedding, but sometimes there is nothing to embed. I solved that by using the zero vector in those cases. Maybe there is a better way?
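One alternative I've been considering is to keep the zero fallback but append a binary "has embedding" flag, so the model can tell missing features apart from genuinely zero embeddings. A rough sketch (names are purely illustrative):

```python
# Hypothetical helper: build a feature matrix from a list of embeddings where
# missing entries are None. Missing nodes get a zero vector plus a 0.0 flag;
# present nodes get their embedding plus a 1.0 flag.
import numpy as np

def build_features(embeddings, dim):
    """embeddings: list of length-`dim` vectors, or None for missing ones."""
    rows = []
    for emb in embeddings:
        if emb is None:
            rows.append(np.concatenate([np.zeros(dim), [0.0]]))
        else:
            rows.append(np.concatenate([np.asarray(emb, dtype=float), [1.0]]))
    return np.stack(rows)
```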

JulianNeuberger · Nov 03 '20 14:11