node2vec

Support Python 3+ in the learn_embeddings method.

Open ghost opened this issue 5 years ago • 6 comments

learn_embeddings needs a minor modification to work with Python 3's map function, which returns a lazy iterator instead of a list. My change is below:

def learn_embeddings(walks):
	'''
	Learn embeddings by optimizing the Skipgram objective using SGD.
	'''
	#print(type(walks))
	#walks = [map(str, walk) for walk in walks]
	walks = [str(j) for i in walks for j in i]
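A minimal sketch of why the commented-out `map` line stops working on Python 3: each `map` object can be consumed only once, so any consumer that passes over the corpus more than once (as Word2Vec does when it first builds the vocabulary and then trains) finds empty walks on the second pass. The toy `walks` data is made up for illustration.

```python
# Toy walks over integer node ids (illustrative data, not from the repo).
walks = [[1, 2, 3], [3, 4, 1]]

# Python 3: map() returns a one-shot iterator, not a list.
lazy_walks = [map(str, walk) for walk in walks]

# The first pass over the corpus sees every node...
first_pass = [list(walk) for walk in lazy_walks]
# ...but the iterators are now exhausted, so a second pass sees nothing.
second_pass = [list(walk) for walk in lazy_walks]

print(first_pass)
print(second_pass)
```

Materializing each walk with `list(...)` avoids this, because a list can be iterated any number of times.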

ghost • Jan 04 '20 12:01

Actually, that alone does not fix the issue with the Python 3 implementation. The solution I offered above has to be changed to the following:

def learn_embeddings(walks):
	'''
	Learn embeddings by optimizing the Skipgram objective using SGD.
	'''
	walks = [str(walk) for walk in walks]

	model = Word2Vec([walks], size=args.dimensions, window=args.window_size, min_count=1, sg=1, workers=args.workers, iter=args.iter)
	model.wv.save_word2vec_format(args.output)

	print('vocabulary', list(model.wv.vocab), 'length', len(list(model.wv.vocab)))

	return

ghost • Jan 07 '20 02:01

Why did you wrap the variable 'walks' in '[]' again? Doesn't that mean you get an embedding for each walk list instead of for each node in the walks?

wen-fei • Feb 01 '20 12:02

Fixed in #35.

wen-fei • Feb 01 '20 13:02

Hi, thanks for your modification! But when I tried it, walks = [str(walk) for walk in walks] did not work: the final embeddings are for whole random walks rather than for nodes. In my opinion, only list() is needed. You could try walks = [list(map(str, walk)) for walk in walks]. With that change, the extra '[]' around walks can be removed.
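A small sketch of the difference, using only the standard library. The `vocabulary` helper is hypothetical: it just mimics the way Word2Vec treats every item of every sentence as one token. The toy `walks` data is made up.

```python
def vocabulary(sentences):
    """Collect the distinct tokens a sentence-based model would see (hypothetical helper)."""
    return {token for sentence in sentences for token in sentence}

# Toy walks over integer node ids (illustrative data).
walks = [[1, 2, 3], [3, 4, 1]]

# str(walk) turns each whole walk into ONE string such as '[1, 2, 3]'.
stringified = [str(walk) for walk in walks]
# Wrapping in [] makes a single "sentence" whose tokens are entire walks.
walk_vocab = vocabulary([stringified])

# list(map(str, walk)) keeps one token per node, one sentence per walk.
node_walks = [list(map(str, walk)) for walk in walks]
node_vocab = vocabulary(node_walks)

print(sorted(walk_vocab))
print(sorted(node_vocab))
```

In the first case the model would learn one vector per walk; in the second it learns one vector per node, which is what node2vec is after.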

liun-online • Feb 04 '20 03:02

You can change this line to walks = np.array(walks, dtype=str).tolist()

The original map call is only meant to convert each element to str.

RuYunW • Dec 03 '20 07:12

The following works fine on Python 3:


def learn_embeddings(walks):
	'''
	Learn embeddings by optimizing the Skipgram objective using SGD.
	'''
	walks = [list(map(str, walk)) for walk in walks]
	model = Word2Vec(walks, size=args.dimensions, window=args.window_size, min_count=0, sg=1, workers=args.workers, iter=args.iter)
	model.wv.save_word2vec_format(args.output)

	print('vocabulary', list(model.wv.vocab), 'length', len(list(model.wv.vocab)))
	
	return
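One caveat for readers on newer installs: the snippet above uses the gensim 3.x keyword names. In gensim 4.x, `size` was renamed to `vector_size`, `iter` to `epochs`, and `model.wv.vocab` was replaced by `model.wv.key_to_index`. A rough sketch of choosing the right keywords follows; `word2vec_kwargs` is a hypothetical helper, not part of gensim or node2vec.

```python
def word2vec_kwargs(gensim_version, dimensions, epochs):
    """Pick Word2Vec keyword names for the given gensim version string (hypothetical helper)."""
    major = int(gensim_version.split(".")[0])
    if major >= 4:
        # gensim 4.x names
        return {"vector_size": dimensions, "epochs": epochs}
    # gensim 3.x names, as used in this thread
    return {"size": dimensions, "iter": epochs}

print(word2vec_kwargs("3.8.3", 128, 1))
print(word2vec_kwargs("4.3.2", 128, 1))
```

Assuming gensim is installed, this could be used as `Word2Vec(walks, window=args.window_size, min_count=0, sg=1, workers=args.workers, **word2vec_kwargs(gensim.__version__, args.dimensions, args.iter))`.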

shoegazerstella • Feb 09 '21 09:02