How to get the output embedding values in unsupervised GraphSAGE?
I want to use the output embedding vectors as features for the nodes. Is there a method to get them?
Hi @jc-moon, thanks for your question!
I think this section from the unsupervised GraphSAGE demo may be relevant for you: https://stellargraph.readthedocs.io/en/stable/demos/embeddings/graphsage-unsupervised-sampler-embeddings.html#Extracting-node-embeddings
The idea is that you initially train the model with an additional layer on top of the embeddings:
# `x_out` has the embedding vectors
x_inp, x_out = graphsage.in_out_tensors()
# additional layer for unsupervised learning task
prediction = link_classification(
output_dim=1, output_act="sigmoid", edge_embedding_method="ip"
)(x_out)
# rest of training code
model = keras.Model(inputs=x_inp, outputs=prediction)
model.compile(...)
model.fit(...)
Then, when you need to extract the node embeddings, create a new Keras model that takes only the source-node inputs and outputs the source-node embeddings from x_out:
x_inp_src = x_inp[0::2]
x_out_src = x_out[0]
embedding_model = keras.Model(inputs=x_inp_src, outputs=x_out_src)
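# node_gen: a GraphSAGENodeGenerator flow over the node IDs, as in the linked demo
node_embeddings = embedding_model.predict(node_gen)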
x_inp[0::2] and x_out[0] are used instead of just x_inp and x_out because the original unsupervised GraphSAGE model takes a pair of nodes as input and produces a pair of embeddings as output (which is why it is used with a GraphSAGELinkGenerator), whereas in our node embedding model each row of input/output corresponds to a single node.
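To make the slicing concrete, here is a small pure-Python sketch. It uses plain strings as stand-ins for the Keras tensors, and it assumes the link model interleaves its inputs as source, destination, source, destination, and so on, which is what the `[0::2]` slice implies. The names `src_hop0`, `dst_hop0`, etc. are made up for illustration and are not StellarGraph identifiers.

```python
# Stand-ins for Keras tensors (hypothetical names, for illustration only).
# A 2-layer GraphSAGE model has 3 inputs per node: the node itself plus
# one sampled neighbourhood per layer.
src_inputs = ["src_hop0", "src_hop1", "src_hop2"]
dst_inputs = ["dst_hop0", "dst_hop1", "dst_hop2"]

# Assumption: the link model interleaves source and destination inputs ...
x_inp = [t for pair in zip(src_inputs, dst_inputs) for t in pair]
# ... and returns one output embedding for each end of the link.
x_out = ["src_embedding", "dst_embedding"]

# Every second input, starting at index 0, belongs to the source node.
x_inp_src = x_inp[0::2]
x_out_src = x_out[0]

print(x_inp)
print(x_inp_src)
print(x_out_src)
```

With the real tensors, `x_inp_src` and `x_out_src` describe a single-node path through the same trained layers, so the embedding model built from them should need no further training.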
Hope that makes sense! Reading the rest of the notebook that I've linked above may give you some additional context too, but let us know how it goes.
@kjun9 Why do we need to create an additional embedding model? Can't we simply use the existing trained model to extract the embeddings?