
w2vparams["batch_words"] default parameter cripples node2vec's performance

Open ubalklen opened this issue 4 years ago • 0 comments

The Node2Vec class constructor sets the default value of w2vparams["batch_words"] to 128, whereas the default in gensim itself is 10000. According to the gensim docs:

batch_words (int, optional) – Target size (in words) for batches of examples passed to worker threads (and thus cython routines). (Larger batches will be passed if individual texts are longer than 10000 words, but the standard cython code truncates to that maximum.)

I don't know exactly what it does behind the scenes, but the current default of 128 severely degrades training performance.

Line of code: https://github.com/VHRanger/nodevectors/blob/5acc519294e7501583e1b509b5b52d6430283c11/nodevectors/node2vec.py#L28
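
Until the default changes, a possible workaround is to pass your own w2vparams with batch_words set back to gensim's 10000. The sketch below is only an illustration: `with_gensim_batch_words` is a hypothetical helper (not part of nodevectors or gensim), and the commented usage assumes the constructor accepts `w2vparams` as a keyword argument, as the description above suggests.

```python
# Value quoted from the gensim docs above.
GENSIM_DEFAULT_BATCH_WORDS = 10000


def with_gensim_batch_words(w2vparams=None, batch_words=GENSIM_DEFAULT_BATCH_WORDS):
    """Return a copy of w2vparams with "batch_words" overridden.

    Hypothetical helper for illustration; the input dict is not modified.
    """
    params = dict(w2vparams or {})
    params["batch_words"] = batch_words
    return params


# Assumed usage (requires nodevectors; not executed here):
# from nodevectors import Node2Vec
# model = Node2Vec(w2vparams=with_gensim_batch_words({"window": 10}))
```

A dict passed this way presumably replaces the default dict as a whole, so include any other Word2Vec settings you rely on (the "window" entry above is just a placeholder).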

ubalklen, Mar 05 '21 18:03