Weird type conversions or naming in Faiss

Open mdouze opened this issue 1 year ago • 5 comments

Summary

Platform

OS: Linux 64 intel

Faiss version: 1.8.0

Installed from: conda

Running on: CPU

Interface: Python

Reproduction instructions

The type mapping between C++ and Python integer representations is weird.

The script ttypes.py gives

(faiss_1.8.0) matthijs@devfair0459:~/src/NeuralCompressionInternal/faiss_index$ python ttypes.py 
int8 -> <Swig Object of type 'char *' at 0x7fce5d7c0a50>
uint8 -> <Swig Object of type 'uint8_t *' at 0x7fce5d7c0a20>
int16 -> <Swig Object of type 'int16_t *' at 0x7fce5d7c0a80>
uint16 -> <Swig Object of type 'uint16_t *' at 0x7fce5d7c0a20>
int32 -> <Swig Object of type 'faiss::HNSW::storage_idx_t *' at 0x7fce5d7c0a50>
uint32 -> <Swig Object of type 'unsigned int *' at 0x7fce5d7c0a20>
int64 -> <Swig Object of type 'int_fast16_t *' at 0x7fce5d7c0a80>
uint64 -> <Swig Object of type 'uintmax_t *' at 0x7fce5d7c0a20>
  • the int32 name is weird, but coherent
  • the int64 name is probably wrong
  • the uint64 name is weird but probably right

It is as if SWIG assigned a random name among all typedefs for a given type.
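For readers who want to reproduce the listing, here is a minimal sketch of what a script like ttypes.py could look like. The original script is not shown in this issue, so this is an assumption: it presumes the script passes one numpy array per integer dtype to `faiss.swig_ptr` and prints the resulting SWIG object. The names `DTYPES`, `format_line` and `main` are hypothetical.

```python
# Hypothetical reconstruction of ttypes.py (the original is not shown in the issue).
# Assumption: it passes one numpy array per integer dtype to faiss.swig_ptr.

DTYPES = ["int8", "uint8", "int16", "uint16", "int32", "uint32", "int64", "uint64"]


def format_line(dtype_name, ptr):
    """Format one output line in the same 'dtype -> pointer' style as the listing."""
    return f"{dtype_name} -> {ptr}"


def main():
    try:
        import numpy as np
        import faiss
    except ImportError:
        print("numpy and faiss are required to run this sketch")
        return
    for name in DTYPES:
        arr = np.zeros(4, dtype=name)
        # swig_ptr returns a SWIG pointer object; its string form includes the
        # C type name that SWIG associated with the array's dtype.
        print(format_line(name, faiss.swig_ptr(arr)))


if __name__ == "__main__":
    main()
```

The type name printed for each dtype is the interesting part. The addresses will differ from run to run.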

mdouze avatar Jul 11 '24 12:07 mdouze

@mdouze could you provide some more context? For example, what prompted this question?

junjieqi avatar Jul 11 '24 16:07 junjieqi

This becomes a problem when you want to implement a custom SWIG wrapper for a small C++ object, as described in

https://github.com/facebookresearch/faiss/wiki/Python-C---code-snippets#wrapping-small-c-objects-for-use-from-python

I tried to do it in the following code

https://github.com/fairinternal/NeuralCompressionInternal/blob/dev/faiss_index/graph_search_traced.swig

but there is no way to define uint64_t in a way that is compatible with the main Faiss implementation (so I resorted to passing pointers via void*).

mdouze avatar Jul 25 '24 05:07 mdouze

The TODO is to (1) fix it and (2) add a test with a small C++ wrapper that actually exercises this, so that we catch regressions. This is also quite platform dependent, as uint64_t is defined differently on platforms like Windows.
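As a rough illustration of the platform dependence, the sketch below uses only the standard `ctypes` module to list which builtin unsigned C types are 4 or 8 bytes wide on the current machine. It is not part of Faiss, and the names `CANDIDATES` and `builtin_types_of_width` are hypothetical. It only reports sizes: it cannot tell which builtin type the platform's headers actually pick for uint64_t, but when several builtin types share a width, fixed-width typedefs may resolve to different ones on different platforms.

```python
import ctypes

# Builtin unsigned C types that a fixed-width typedef such as uint32_t or
# uint64_t could resolve to, depending on the platform's headers.
CANDIDATES = {
    "unsigned int": ctypes.c_uint,
    "unsigned long": ctypes.c_ulong,
    "unsigned long long": ctypes.c_ulonglong,
}


def builtin_types_of_width(nbytes):
    """Return the names of the candidate types that are `nbytes` wide here."""
    return [name for name, t in CANDIDATES.items() if ctypes.sizeof(t) == nbytes]


if __name__ == "__main__":
    for nbytes in (4, 8):
        print(f"{8 * nbytes}-bit candidates:", builtin_types_of_width(nbytes))
```

On a platform where `unsigned long` and `unsigned long long` are both 8 bytes, two distinct C types have the same width, which is the kind of situation in which a SWIG wrapper built separately from Faiss can end up with a type name that does not match.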

mdouze avatar Jul 25 '24 05:07 mdouze

Started working on this in https://github.com/facebookresearch/faiss/pull/3699

mdouze avatar Jul 29 '24 12:07 mdouze

Another data point: on Mac (conda 1.8.0), ttypes.py gives:

int8 -> <Swig Object of type 'char *' at 0x11f868480>
uint8 -> <Swig Object of type 'uint8_t *' at 0x11f868450>
int16 -> <Swig Object of type 'int16_t *' at 0x11f8684b0>
uint16 -> <Swig Object of type 'uint16_t *' at 0x11f868450>
int32 -> <Swig Object of type 'int_fast16_t *' at 0x11f868480>
uint32 -> <Swig Object of type 'uint_fast16_t *' at 0x11f868450>
int64 -> <Swig Object of type 'faiss::CMax< float,int64_t >::TI *' at 0x11f8684b0>
uint64 -> <Swig Object of type 'uintmax_t *' at 0x11f868450>

So the names are wrong for int32 and uint32.

mdouze avatar Aug 09 '24 08:08 mdouze