How to get the output embedding values in unsupervised graphsage?

Open jc-moon opened this issue 5 years ago • 2 comments

I want to use the output embedding vectors as features for the nodes. Is there a method to get them?

jc-moon, May 21 '20 11:05

Hi @jc-moon thanks for your question!

I think this section from the unsupervised GraphSAGE demo may be relevant for you: https://stellargraph.readthedocs.io/en/stable/demos/embeddings/graphsage-unsupervised-sampler-embeddings.html#Extracting-node-embeddings

The idea is that you initially train the model with an additional layer on top of the embeddings:

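from tensorflow import keras
from stellargraph.layers import link_classification

# `graphsage` is a stellargraph.layers.GraphSAGE instance built from a GraphSAGELinkGenerator
# `x_out` has the embedding vectors
x_inp, x_out = graphsage.in_out_tensors()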

# additional layer for unsupervised learning task
prediction = link_classification(
    output_dim=1, output_act="sigmoid", edge_embedding_method="ip"
)(x_out)

# rest of training code
model = keras.Model(inputs=x_inp, outputs=prediction)
model.compile(...)
model.fit(...)

Then, when you need to extract the node embeddings, create a new Keras model that has x_out as its output:

x_inp_src = x_inp[0::2]
x_out_src = x_out[0]
embedding_model = keras.Model(inputs=x_inp_src, outputs=x_out_src)
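# `node_gen` yields single nodes; in the linked demo it is
# GraphSAGENodeGenerator(G, batch_size, num_samples).flow(node_ids)
node_embeddings = embedding_model.predict(node_gen)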

The reason for using x_inp[0::2] and x_out[0] instead of just x_inp and x_out is that the original unsupervised GraphSAGE model takes a pair of nodes as input and output (which is why it is used with a GraphSAGELinkGenerator), whereas in our node embedding model each row of input/output corresponds to a single node.
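To make the slicing concrete, here is a minimal sketch in plain Python. It assumes that the link model's input list interleaves the inputs for the two nodes of each pair (first node, second node, first node, ...), which is what the x_inp[0::2] slice relies on. The string labels are hypothetical stand-ins for the Keras tensors, not StellarGraph objects.

```python
# Hypothetical stand-ins for the tensors of a 2-layer link model:
# one input per sampling hop, for each of the two nodes in a pair.
src_inputs = ["src_hop0", "src_hop1", "src_hop2"]
dst_inputs = ["dst_hop0", "dst_hop1", "dst_hop2"]

# Assumed layout: inputs for the two nodes are interleaved.
x_inp = [t for pair in zip(src_inputs, dst_inputs) for t in pair]
# One output embedding per node of the pair.
x_out = ["src_embedding", "dst_embedding"]

# Every second input, starting at index 0, belongs to the first node.
x_inp_src = x_inp[0::2]
x_out_src = x_out[0]

print(x_inp_src)
print(x_out_src)
```

Under this assumption, x_inp[1::2] and x_out[1] would select the second node of each pair in the same way.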

Hope that makes sense! Reading the rest of the notebook that I've linked above may give you some additional context too, but let us know how it goes.

kjun9, May 21 '20 22:05

@kjun9 Why is there a need to create an additional embedding model? Can't we simply use the existing trained model to extract the embeddings?

kapeed1011, Dec 20 '21 09:12