SPTAG
OverflowError in method 'AnnIndex_AddWithMetaData' when running test script
Describe the bug: OverflowError: in method 'AnnIndex_AddWithMetaData', argument 4 of type 'int' when running the Python test script from "Get Started".
To Reproduce: Create a new Python file in the Release directory and copy and paste the Python test script from "Get Started" into it.
Additional context: My environment is Ubuntu 18.04, kernel 4.15.0 x86_64. Tried swig 3.0 and 4.0, cmake 3.14.4, gcc 7.4.0, python 3.6.
I have the same error on OSX with Python 3.7.3.
This is a very strange error. I looked into the swig generated CoreInterface_pwrap.cpp, and everything seems to be fine.
Also getting the same error. Here's some context including the stack trace
[1, 101, 3]
[10.0, 10.0, 90.0]
[1, 3, 101]
[10.0, 10.0, 10.0]
[3, 5, 103]
[10.0, 10.0, 10.0]
AddWithMetaData.............................
Setting NumberOfThreads with value 4
Setting DistCalcMethod with value L2
Traceback (most recent call last):
  File "/data/test.py", line 90, in <module>
    Test('BKT', 'L2')
  File "/data/test.py", line 83, in Test
    testAddWithMetaData(None, x, m, 'testindices', algo, distmethod)
  File "/data/test.py", line 56, in testAddWithMetaData
    if i.AddWithMetaData(x.tobytes(), s, x.shape[0]):
  File "/app/Release/SPTAG.py", line 143, in AddWithMetaData
    return _SPTAG.AnnIndex_AddWithMetaData(self, p_data, p_meta, p_num)
OverflowError: in method 'AnnIndex_AddWithMetaData', argument 4 of type 'int'
I have the same issue with Ubuntu 18.04, swig 4.0, cmake 3.14.4 and Python 3.6.0. It also occurs when building with Docker.
def testBuildWithMetaData(algo, distmethod, x, s, out):
    i = SPTAG.AnnIndex(algo, 'Float', x.shape[1])
    i.SetBuildParam("NumberOfThreads", '4')
    i.SetBuildParam("DistCalcMethod", distmethod)
    shape = float(x.shape[0])
    print(type(shape))
    if i.BuildWithMetaData(x.tobytes(), s, shape):
        i.Save(out)

def Test(algo, distmethod):
    x = np.ones((n, 10), dtype=np.float32) * np.reshape(np.arange(n, dtype=np.float32), (n, 1))
    q = np.ones((r, 10), dtype=np.float32) * np.reshape(np.arange(r, dtype=np.float32), (r, 1)) * 2
    m = ''
    for i in range(n):
        m += str(i) + '\n'
    print ("Build with metadata.............................")
    testBuildWithMetaData(algo, distmethod, x, m, 'testindices')

if __name__ == '__main__':
    Test('BKT', 'L2')
This returns the same error, argument 4 of type 'int', as:
def testBuildWithMetaData(algo, distmethod, x, s, out):
    i = SPTAG.AnnIndex(algo, 'Float', x.shape[1])
    i.SetBuildParam("NumberOfThreads", '4')
    i.SetBuildParam("DistCalcMethod", distmethod)
    shape = int(x.shape[0])
    print(type(shape))
    if i.BuildWithMetaData(x.tobytes(), s, shape):
        i.Save(out)

def Test(algo, distmethod):
    x = np.ones((n, 10), dtype=np.float32) * np.reshape(np.arange(n, dtype=np.float32), (n, 1))
    q = np.ones((r, 10), dtype=np.float32) * np.reshape(np.arange(r, dtype=np.float32), (r, 1)) * 2
    m = ''
    for i in range(n):
        m += str(i) + '\n'
    print ("Build with metadata.............................")
    testBuildWithMetaData(algo, distmethod, x, m, 'testindices')

if __name__ == '__main__':
    Test('BKT', 'L2')
even though the first example converts the count (argument 4) to float.
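One way to rule out the row count itself as the culprit is to check, in plain Python, that it is an ordinary int that fits in a C int. The sketch below assumes a 32-bit signed C int, which is typical but not guaranteed; the helper name fits_c_int and the value n = 100 are made up for illustration and are not part of SPTAG or the test script.

```python
import numpy as np

# Assumption: the C 'int' on this platform is 32-bit signed.
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

def fits_c_int(value):
    """Return True if value is a Python int within the 32-bit signed range."""
    if isinstance(value, bool) or not isinstance(value, int):
        return False
    return INT32_MIN <= value <= INT32_MAX

n = 100  # hypothetical row count, not taken from the thread
x = np.ones((n, 10), dtype=np.float32)

# Entries of ndarray.shape are plain Python ints.
print(type(x.shape[0]), fits_c_int(x.shape[0]))
```

If this prints True for the real data, the value passed as p_num is not out of range by itself, which suggests the OverflowError message is pointing at the wrong argument.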
Same here, running the docker version
This issue only appears in Python 3. After adding m = m.encode() before the BuildWithMetaData and AddWithMetaData calls, the error disappears...
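A minimal sketch of that workaround, building the metadata the same way as the test script and then encoding it to bytes. The helper name build_metadata is made up for illustration, and the SPTAG call is left as a comment because it needs the compiled module; whether this resolves the error may depend on the environment.

```python
def build_metadata(n):
    """Build one newline-terminated metadata record per vector, as bytes."""
    text = ''.join(str(i) + '\n' for i in range(n))
    # In Python 3, str is Unicode text; encode() (UTF-8 by default)
    # turns it into a bytes object.
    return text.encode()

m = build_metadata(3)
print(type(m), m)

# With SPTAG available, the call from the thread would then be:
# i.BuildWithMetaData(x.tobytes(), m, x.shape[0])
```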
@MaggieQi That doesn't work for me in Python 3.6: the interpreter reports a problem with argument 4, which is p_num. I doubt that doing anything with the metadata, which is argument 3, would fix it, unless the error comes from the argument 3 buffer overflowing and writing into argument 4....
I think the SPTAG development team made a very bad decision by exposing the library through C++ classes. SWIG has limited support for C++, and this may cause other problems, not to mention the use of a C++ class reference (ByteArray) as a parameter...
The m.encode() trick seems to be working.
As for how the metadata is represented in the C++ code, there seem to be overflow issues when the strings are too long. I was getting segfaults all over the place with strings longer than a few hundred characters.
Tried multiple times, still doesn't work for me. What's your environment?
I think it would be better to change the C++ implementation of AddWithMetaData to take raw char* buffers. SWIG might be confused by the different constructors of ByteArray.
You are probably right about it being better to fix the underlying problem.
I am on
- OSX 10.14.5
- Python 3.7.3 (the one in brew)
- cmake 3.14.3
- gcc 9.1.0
- swig 4.0.0
- most recent master branch as of writing.
It is a good suggestion! I will try to remove the ByteArray from the Wrapper part.