Could I get the node embedding?
Hi,
Could I get node embeddings of a given length from this code? Which step's output should I extract?
Thanks,
You can extract and examine any hidden layer activation and check whether it is useful as some form of graph embedding. If you train in a supervised way, then these embeddings will be very specialized for the task that you trained the model for, of course. If you want unsupervised embeddings, have a look at my code for graph auto-encoders: https://github.com/tkipf/gae
Excuse me, I'm training in a supervised way. It seems the GCN can only produce embeddings of length num-class if I take the output of the last GCN layer? An arbitrarily chosen embedding dimension doesn't seem possible there.
In this case it's best to simply take the embeddings just before the last linear projection to the softmax logits. In other words, if the last layer is softmax(AHW), take either the embedding H directly or A*H.
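To make the suggestion above concrete, here is a minimal numpy sketch (not the pygcn code itself; all sizes and names are illustrative) of a two-layer GCN forward pass softmax(A·ReLU(A·X·W0)·W1), showing which intermediate tensor to keep as the node embedding:

```python
import numpy as np

rng = np.random.default_rng(0)

n, f, hid, classes = 5, 8, 16, 3      # toy sizes, chosen arbitrarily
X = rng.normal(size=(n, f))           # node features
A = np.eye(n)                         # stand-in for the normalized adjacency
W0 = rng.normal(size=(f, hid))
W1 = rng.normal(size=(hid, classes))

H = np.maximum(A @ X @ W0, 0.0)       # hidden activation: ReLU(A·X·W0)
logits = A @ H @ W1                   # input to the final softmax

# The embedding is taken just before the last projection W1:
# either H directly, or A·H. Its width is `hid`, not num-class.
embedding = A @ H
print(embedding.shape)                # (5, 16): one hid-dim vector per node
```

Note that the embedding width is set by the hidden-layer size, which you can choose freely, independent of the number of classes.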
It's clear now. Thanks.
I used the intermediate result H, but the embeddings don't seem usable for the link prediction task, i.e. there is little difference between the dot products of node pairs with and without a connection. That's understandable, since GCN was originally designed for node classification rather than link prediction. Does anyone have an idea how to apply the embeddings to link prediction?
The dot product will not be a good scoring function on embeddings trained solely for classification. You can either use the embeddings from github.com/tkipf/gae, which are optimized for dot-product scoring (link prediction), or train a bilinear scoring function on top of the fixed embeddings (taken from the supervised GCN model). A bilinear scoring function looks like this: \sigma(h_i^T W h_j), where h_i and h_j are the embeddings of the two nodes, W is a matrix you train via gradient descent on some training data for link prediction, and \sigma is the sigmoid activation function.
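The bilinear scorer described above can be sketched in numpy as follows. This is a hedged illustration, not code from this repo: the embeddings, pairs, and labels are synthetic, and W is fit by plain gradient descent on a logistic loss while the embeddings stay fixed.

```python
import numpy as np

rng = np.random.default_rng(0)

n, d = 20, 8
H = rng.normal(size=(n, d))                          # fixed GCN embeddings (synthetic here)
pairs = rng.integers(0, n, size=(50, 2))             # candidate edges (i, j)
labels = rng.integers(0, 2, size=50).astype(float)   # 1 = edge, 0 = no edge

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
W = np.zeros((d, d))                                 # bilinear weight matrix to learn
lr = 0.1

for _ in range(200):
    hi, hj = H[pairs[:, 0]], H[pairs[:, 1]]
    scores = sigmoid(np.einsum('bi,ij,bj->b', hi, W, hj))   # sigma(h_i^T W h_j)
    # gradient of the mean logistic loss w.r.t. W: (sigma - y) * h_i h_j^T
    grad = np.einsum('b,bi,bj->ij', scores - labels, hi, hj) / len(labels)
    W -= lr * grad

final = sigmoid(np.einsum('bi,ij,bj->b', H[pairs[:, 0]], W, H[pairs[:, 1]]))
acc = np.mean((final > 0.5) == labels.astype(bool))
print(f"training accuracy: {acc:.2f}")
```

In practice you would take negative samples (non-edges) from the graph rather than random labels, and evaluate on held-out edges.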
Could you please clarify how the W matrix is adapted for link prediction tasks? To my understanding, we need the adjacency matrix of the last layer to check the links, but in the code 'adj' never changes.
Thanks.