explicitly pass module name to namedtuple declarations

Open aaronknister opened this issue 2 years ago • 2 comments

This was suggested by @kmurphy4 on github.com/lmcinnes/umap/issues/477 as a workaround for pyspark hijacking collections.namedtuple.
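For readers unfamiliar with the keyword in question, here is a minimal sketch of passing `module` explicitly to `collections.namedtuple`. The type name, field names, and module string below are hypothetical placeholders, not the actual declarations touched by this change.

```python
from collections import namedtuple

# Hypothetical type and module names, for illustration only.
# By default, namedtuple infers __module__ from the caller's frame;
# the keyword-only `module` argument sets it explicitly instead.
Heap = namedtuple(
    "Heap",
    ["indices", "distances"],
    module="pynndescent.example",  # assumed module path, not the real one
)

h = Heap(indices=[0, 1], distances=[0.0, 0.5])
print(Heap.__module__)
print(h.indices, h.distances)
```

With the argument supplied, `__module__` no longer depends on which frame happens to call `namedtuple`, which is presumably why it helps when another library wraps that function.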

aaronknister avatar Jun 07 '22 15:06 aaronknister

I don't love adding this just because someone else has a bug, but I guess it's pretty harmless (and it prevents other libraries from introducing the same issue, as unlikely as that may be).

Have you confirmed that this fixes the issue with pyspark?

jamestwebber avatar Jul 26 '22 17:07 jamestwebber

If it can fix a pyspark issue that's definitely a good thing, since it is a relatively small change. I would want to ensure that this all plays nice with numba though, since numba support for named tuples is performance critical in some places, and numba may not be happy with all of this (hence the desire to check). All the tests have passed, so that's definitely a good sign that numba is not failing on the extra keyword argument. Just to be sure, however, I would really appreciate a quick performance benchmark comparison (which the tests don't check): index build time comparisons, and, if you have time, a comparison with and without this change on, say, the fashion-mnist ann-benchmarks, or something equivalent that decently exercises the code and records performance measures.
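One way to run such a before/after comparison is a small timing harness like the sketch below. It uses only the standard library; `placeholder_build` is a hypothetical stand-in that should be replaced with the real index construction on each branch, and skipping the first call (to leave out one-off costs such as JIT compilation) is an assumption about what is worth measuring, not a requirement stated in this thread.

```python
import statistics
import time


def time_build(build_fn, repeats=5):
    """Return the median wall-clock seconds of build_fn() over `repeats` runs."""
    build_fn()  # warm-up call, so one-off costs are not counted
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        build_fn()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def placeholder_build():
    # Hypothetical workload; swap in the actual index build being compared.
    sorted(range(100_000), key=lambda x: -x)


median_seconds = time_build(placeholder_build, repeats=3)
print(f"median build time: {median_seconds:.4f} s")
```

Running the same script on the branch with and without the change, on the same data and machine, gives two medians that can be compared directly.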

lmcinnes avatar Jul 27 '22 15:07 lmcinnes