sherlock-project
gensim version
Hi, I'm using gensim 3.8.0 but still get this error when I run extract_features():
'Doc2Vec' object has no attribute 'neg_labels'
Is there a way to avoid this?
That's strange.
I found this quote on an SO post:
that error resembles a very-old bug which only showed up if Gensim was not fully installed to have the necessary Cython-optimized routines for fast training/inference operations. (That caused some older, seldom-run code to be run that had a dependency on the missing neg_labels. Newer versions of Gensim have eliminated that slow code-path entirely.)
Would you mind bumping the dependency version to gensim 3.8.3 and trying again? According to the release notes, there's a particular fix, "Fix missing C extensions", which might help given the above quote.
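Before and after bumping the version, it may help to confirm which gensim is actually installed and whether its compiled routines were loaded. The snippet below is a minimal sketch, not part of sherlock: the helper names `describe_fast_version` and `gensim_report` are made up for illustration, and it assumes that `FAST_VERSION` is exposed on `gensim.models.doc2vec` (as the assertion in sherlock/features/paragraph_vectors.py suggests) with -1 meaning the optimized routines are unavailable.

```python
import importlib


def describe_fast_version(fast_version):
    """Turn a FAST_VERSION value into a short human-readable note (hypothetical helper)."""
    if fast_version is None:
        return "FAST_VERSION not found on gensim.models.doc2vec"
    if fast_version > -1:
        return f"compiled routines loaded (FAST_VERSION={fast_version})"
    # Assumption: -1 marks the slow path without the Cython extensions.
    return "compiled routines missing (FAST_VERSION=-1)"


def gensim_report():
    """Report the installed gensim version and the state of its C extensions."""
    try:
        gensim = importlib.import_module("gensim")
        doc2vec = importlib.import_module("gensim.models.doc2vec")
    except ImportError as exc:
        return f"gensim could not be imported: {exc}"
    fast_version = getattr(doc2vec, "FAST_VERSION", None)
    return f"gensim {gensim.__version__}: {describe_fast_version(fast_version)}"


if __name__ == "__main__":
    print(gensim_report())
```

If the report says the compiled routines are missing, reinstalling gensim in a clean environment is probably worth trying before changing any sherlock code.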
I just tested gensim 3.8.3
I had to edit sherlock/features/paragraph_vectors.py and comment out the line:
assert gensim.models.doc2vec.FAST_VERSION > -1, "This will be painfully slow otherwise"
I ran the notebook 01-data-preprocessing.ipynb from a fresh clone of the repo. It ran cleanly and there was no observable impact on the performance despite the scary message from the assertion we just disabled.
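A less invasive alternative to commenting the assertion out might be to downgrade it to a warning, so that slow installs are still flagged without stopping the run. This is only a sketch of that idea: `warn_if_slow` is a hypothetical helper, not something that exists in sherlock or gensim, and it assumes the same convention as the assertion, namely that a `FAST_VERSION` of -1 means the optimized routines are missing.

```python
import warnings


def warn_if_slow(fast_version):
    """Return True when the fast path is available; otherwise warn and return False."""
    if fast_version > -1:
        return True
    warnings.warn(
        "gensim FAST_VERSION is -1; training/inference may be painfully slow",
        RuntimeWarning,
        stacklevel=2,
    )
    return False
```

In paragraph_vectors.py this would be called as `warn_if_slow(gensim.models.doc2vec.FAST_VERSION)` in place of the `assert` line.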
Thanks for reporting this issue @engineersunny, and testing and sharing your solution @lowecg. @engineersunny did this solution work for you?
@lowecg I did not encounter this issue myself. Do you know if this is a general issue (e.g. does it happen in the extract_features_to_csv() function as well)? Do you think we should comment this line out by default?
@madelonhulsebos I got that assertion error each time I ran the code as well, so I commented the line out earlier, but I am still getting the error message above. It might be a gensim version issue, as I couldn't install the old version on my machine.