
Memory corruption when using alternative algorithm

Open pavlin-policar opened this issue 5 years ago • 0 comments

In some cases (I haven't been able to find a pattern), the library aborts with memory corruption. I can reproduce this consistently with the following code:

>>> import numpy as np
>>> from pynndescent import NNDescent

>>> x = np.genfromtxt("mouse_sample_1500.txt", delimiter=",")
>>> index = NNDescent(x, algorithm="alternative")  # works fine

>>> x = np.genfromtxt("mouse_sample_100.txt", delimiter=",")
>>> index = NNDescent(x, algorithm="alternative")  # works fine

>>> x = np.genfromtxt("mouse_sample_1000.txt", delimiter=",")
>>> index = NNDescent(x, algorithm="alternative")
double free or corruption (out)
abort (core dumped)  python

The data in question is a small dense (1000, 50) matrix. Bizarrely, a smaller (100, 50) and a larger (1500, 50) matrix work perfectly fine. I can consistently replicate this with the files attached below.
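Because the crash aborts the whole interpreter, it can be easier to compare inputs by running each attempt in a child process. The following is a minimal sketch of that idea, not part of pynndescent: `run_isolated`, `reproduce`, and `SNIPPET` are hypothetical names introduced here, and the sketch assumes the three attached files sit in the working directory. It sticks to `subprocess` arguments that exist on Python 3.6.

```python
import subprocess
import sys

# Template mirroring the snippet from this report. The file names and the
# algorithm values are the ones used in the issue; nothing else is assumed
# about the pynndescent API.
SNIPPET = """
import numpy as np
from pynndescent import NNDescent
x = np.genfromtxt({path!r}, delimiter=",")
NNDescent(x, algorithm={algorithm!r})
"""


def run_isolated(code):
    """Run `code` in a fresh interpreter and return (exit code, stderr text)."""
    proc = subprocess.run(
        [sys.executable, "-c", code],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode output as text (Python 3.6 compatible)
    )
    return proc.returncode, proc.stderr


def reproduce(paths, algorithms=("standard", "alternative")):
    """Try every (file, algorithm) pair in its own process and collect exit codes."""
    results = {}
    for path in paths:
        for algorithm in algorithms:
            code = SNIPPET.format(path=path, algorithm=algorithm)
            returncode, _ = run_isolated(code)
            # On POSIX, a negative exit code means the child was killed by a
            # signal (an abort shows up as SIGABRT); 0 means a clean run.
            results[(path, algorithm)] = returncode
    return results
```

Calling `reproduce(["mouse_sample_100.txt", "mouse_sample_1000.txt", "mouse_sample_1500.txt"])` returns a dictionary of exit codes, so a crashing combination stands out without taking down the session that launched it.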

mouse_sample_100.txt mouse_sample_1000.txt mouse_sample_1500.txt

I created an empty conda environment with python=3.6.7. I installed numpy and pynndescent using pip:

> pip freeze

certifi==2018.10.15
llvmlite==0.26.0
numba==0.41.0
numpy==1.15.4
pynndescent==0.2.1
scikit-learn==0.20.1
scipy==1.1.0

The crash does not occur with algorithm="standard".

pavlin-policar, Dec 08 '18 15:12