pytorch-lr-scheduler
Node (names | ids) in the output of the deepwalk-c?
Hey,
I've been playing with your implementation of deepwalk, and I couldn't really wrap my head around the output format of the binary file. If I read it into a numpy array, how can I recover the ids of the nodes?
Do you have a suggestion for how to rewrite the binary output into an ASCII or UTF-8 format like the one below?
node_i emb^i_1 emb^i_2 ... emb^i_d
...
node_n emb^n_1 emb^n_2 ... emb^n_d
You can quite literally do np.fromfile('datafilepath', np.float32).reshape(num_vertices, dimensions), where num_vertices is the number of nodes in the graph and dimensions is the embedding size.
Please note that, for performance, I do not use custom node ids in the data format (i.e. nodes are renamed). You can see this in the conversion script. The original ids are preserved in the order of sorted(initial_ids), so row k of the array corresponds to the k-th id in that sorted list.
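Putting the two points together, here is a minimal sketch of the text conversion asked about above. It is not part of deepwalk-c: the function name embeddings_to_text and its parameters are made up for illustration, and it assumes the binary file is a headerless, row-major block of float32 values and that you still have the original ids (for example, collected from your edge list).

```python
import numpy as np

def embeddings_to_text(bin_path, initial_ids, dimensions, out_path):
    """Write one 'node_id v1 v2 ... vd' line per node (hypothetical helper)."""
    # Assumption: row k of the binary file belongs to the k-th id in sorted order.
    node_ids = sorted(set(initial_ids))
    # Assumption: the file holds only float32 values, row-major, with no header.
    # reshape raises ValueError if the file size does not match the expected shape.
    emb = np.fromfile(bin_path, dtype=np.float32).reshape(len(node_ids), dimensions)
    with open(out_path, "w", encoding="utf-8") as f:
        for node_id, vec in zip(node_ids, emb):
            f.write(f"{node_id} " + " ".join(f"{x:.6f}" for x in vec) + "\n")
    return node_ids, emb
```

One thing to watch: sorted() orders integers numerically but strings lexicographically (so "10" comes before "2"). Pass the ids in the same type the conversion script sorts them in, otherwise the rows will be matched to the wrong nodes.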