Madelon Hulsebos
That looks useful, thanks for making and sharing! The 78 semantic types that Sherlock is trained on can be found in table 19 on page 28 in [this paper](https://adalabucsd.github.io/papers/TR_2021_SortingHat.pdf). It...
You're welcome! PS, all types match a type in wikidata.
Thanks for sharing this issue! I hope you managed to make it work in the meantime, but I will look into this later. If you have a general solution, a...
Thanks @michaelmior, I managed the packages as in the `requirements.txt` with conda but did not test the 3.8 requirements. Will work on this, thanks!
Dear @stranger-codebits, Thanks a lot for reporting your issue and findings, this is a great catch. I hope to have time to look into this soon. In the meantime, feel...
Hi Nikolaos, That would be much appreciated! There are no guidelines in place, but it would be great if you could provide your solution along with some evidence showing that...
Hi! These can be generated with the notebook here: https://github.com/mitmedialab/sherlock-project/blob/master/notebooks/01-data-preprocessing.ipynb. Please let me know if this is clear and works for you! Madelon
Hi Varnit, These files can be obtained by extracting paragraph vectors again with the code in this module: https://github.com/mitmedialab/sherlock-project/blob/master/sherlock/features/paragraph_vectors.py. The process is displayed here: https://github.com/mitmedialab/sherlock-project/blob/master/notebooks/03-retrain-paragraph-vector-features.ipynb. Does that address your question?...
Hi Varnit, These additional files are generated automatically by gensim through the `.save()` method when the model is rather large. These files are then also expected to exist upon loading...
Hi Giacomo, Thank you! - I recommend building a new paragraph vector model, but you can try whether the existing one works for your dataset. - Indeed, this should be done...