Gael Varoquaux
> should I declare optional dependencies (see below) Yes!! > If we declare optional dependencies with extras_require, then should we add python-Levenshtein too? Indeed!
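For illustration, a minimal sketch of how an optional dependency could be declared and detected. The extra name `levenshtein` and the helper `has_optional` are hypothetical, not taken from the PR; only `extras_require` and the PyPI package name `python-Levenshtein` come from the discussion.

```python
import importlib.util

# Hypothetical mapping that would be passed to setuptools as
#   setup(..., extras_require=EXTRAS_REQUIRE)
# so that users can opt in with: pip install "<package>[levenshtein]"
EXTRAS_REQUIRE = {
    "levenshtein": ["python-Levenshtein"],
}


def has_optional(module_name):
    """Return True if a top-level module can be imported, without importing it."""
    return importlib.util.find_spec(module_name) is not None


# python-Levenshtein installs a top-level module named "Levenshtein".
# Code paths that need it can check this flag and fall back otherwise.
USE_LEVENSHTEIN = has_optional("Levenshtein")
```

With this pattern the package installs without the extra, and the faster code path is only taken when the module is actually present.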
I've started having a look again at this PR. I worry that we have added a 1.6 MB file to the package (the dbpedia file), which makes it much heavier than...
There are still places where python-levenshtein remains: I found one in .travis.yml, where I believe it should be removed. Can you run "git grep levenshtein" and check all the...
I worry that the self.hash_dict was really useful to speed things up by avoiding recomputation of repeated entries
> Do you think that's likely to happen / worth the additional memory usage ? Yes: it's very frequent. Typically, the entries are repeated many times.
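To make the trade-off concrete, here is a minimal sketch of the caching idea, assuming a per-entry encoding function. The names `encode_with_cache` and `toy_encode` are hypothetical and this is not the actual `self.hash_dict` code; it only shows why repeated entries make a dictionary cache pay off.

```python
def encode_with_cache(values, encode_one):
    """Encode each value, computing encode_one only once per distinct value."""
    cache = {}  # plays the role of a hash_dict: value -> encoded result
    encoded = []
    for value in values:
        if value not in cache:
            cache[value] = encode_one(value)  # the expensive step
        encoded.append(cache[value])
    return encoded


calls = []  # records which entries were actually computed


def toy_encode(entry):
    # Stand-in for an expensive encoding (assumption: the real one is costlier).
    calls.append(entry)
    return len(entry)


result = encode_with_cache(["paris", "london", "paris", "paris"], toy_encode)
```

Here four entries trigger only two computations. The memory cost grows with the number of distinct entries, not with the number of rows.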
> Benchmarking really quickly on my Mac M1 with 8 cores, the rows version is about twice as fast as the batched version, but we should do a more serious benchmark....
Yes, we shouldn't error on clean datasets, just pass them along. I'm not sure that a warning is warranted.
+1 for adding this comparison as a new section at the end. Thanks!!
Looks good to me. I'm merging
> In order to fix that, we should upgrade to SciPy 1.4.0 (released December 2019) and to NumPy 1.17.3 (released October 2019) the minimum requirement for the SciPy version. >...