
Node (names | ids) in the output of the deepwalk-c?

Open plumdeq opened this issue 7 years ago • 1 comments

Hey,

I've been playing with your implementation of deepwalk, and I can't quite wrap my head around the output format of the binary file. If I read it into a numpy array, how can I recover the ids of the nodes?

Do you have a suggestion for how to rewrite the binary output into an ASCII or UTF-8 format like the one below?

node_i emb^i_1 emb^i_2 ... emb^i_d
...
node_n emb^n_1 emb^n_2 ... emb^n_d

plumdeq avatar Dec 22 '17 13:12 plumdeq

You can quite literally do np.fromfile('datafilepath', np.float32).reshape(num_vertices, dimensions)

Please note that for performance I do not store custom node ids in the data format (i.e. nodes are renamed). You can see this in the conversion script. The original ids are preserved in sorted order, i.e. the rows follow sorted(initial_ids).
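For anyone who wants the text format from the question, here is a minimal sketch, not part of deepwalk-c itself. It assumes that row i of the reshaped array belongs to sorted(initial_ids)[i] and that you still have the original ids (for example, from your edge list). The helper names load_embeddings and write_text_embeddings are hypothetical. Whether the ids sort numerically or as strings depends on the conversion script, so check that before relying on the mapping.

```python
import os
import tempfile

import numpy as np


def load_embeddings(path, num_vertices, dimensions):
    """Read the raw float32 dump and reshape it to (num_vertices, dimensions)."""
    return np.fromfile(path, dtype=np.float32).reshape(num_vertices, dimensions)


def write_text_embeddings(embeddings, initial_ids, out_path):
    """Write one 'node_id emb_1 ... emb_d' line per node.

    Assumption: row i of `embeddings` belongs to sorted(initial_ids)[i].
    """
    node_ids = sorted(set(initial_ids))
    if len(node_ids) != embeddings.shape[0]:
        raise ValueError("number of ids does not match number of rows")
    with open(out_path, "w", encoding="utf-8") as f:
        for node_id, row in zip(node_ids, embeddings):
            values = " ".join(repr(float(x)) for x in row)
            f.write(f"{node_id} {values}\n")


# Tiny round trip on synthetic data (3 nodes, 2 dimensions).
with tempfile.TemporaryDirectory() as tmp:
    bin_path = os.path.join(tmp, "emb.bin")
    txt_path = os.path.join(tmp, "emb.txt")
    np.arange(6, dtype=np.float32).tofile(bin_path)
    emb = load_embeddings(bin_path, num_vertices=3, dimensions=2)
    write_text_embeddings(emb, initial_ids=[42, 7, 19], out_path=txt_path)
    with open(txt_path, encoding="utf-8") as f:
        text = f.read()
```

In the synthetic example the ids 42, 7, 19 are sorted to 7, 19, 42 before being paired with rows 0, 1, 2. With a real output file, pass the actual number of vertices and the embedding dimension you trained with; the reshape raises an error if they do not match the file size.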

xgfs avatar Jan 14 '18 18:01 xgfs