Weird type conversions or naming in Faiss
Summary
Platform
OS: Linux 64 intel
Faiss version: 1.8.0
Installed from: conda
Running on: CPU
Interface: Python
Reproduction instructions
The type mapping between C++ and Python integer representations is weird.
The script ttypes.py gives:
(faiss_1.8.0) matthijs@devfair0459:~/src/NeuralCompressionInternal/faiss_index$ python ttypes.py
int8 -> <Swig Object of type 'char *' at 0x7fce5d7c0a50>
uint8 -> <Swig Object of type 'uint8_t *' at 0x7fce5d7c0a20>
int16 -> <Swig Object of type 'int16_t *' at 0x7fce5d7c0a80>
uint16 -> <Swig Object of type 'uint16_t *' at 0x7fce5d7c0a20>
int32 -> <Swig Object of type 'faiss::HNSW::storage_idx_t *' at 0x7fce5d7c0a50>
uint32 -> <Swig Object of type 'unsigned int *' at 0x7fce5d7c0a20>
int64 -> <Swig Object of type 'int_fast16_t *' at 0x7fce5d7c0a80>
uint64 -> <Swig Object of type 'uintmax_t *' at 0x7fce5d7c0a20>
- the int32 name is odd, but it is consistent
- the int64 one is probably wrong
- the uint64 one is odd but probably right
It is as if SWIG assigned a random name among all typedefs for a given type.
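To make the comparison above mechanical, here is a minimal sketch in plain Python (no Faiss import) that extracts the C type name from a Swig object repr and lists the dtypes whose reported name is not the plain `<dtype>_t` spelling. The helper names `swig_pointee_type` and `noncanonical_names` are hypothetical, and the input is simply the Linux output pasted above. A non-canonical name is not necessarily a wrong type: it may be another typedef of the same underlying type.

```python
import re

# Matches reprs such as "<Swig Object of type 'uint8_t *' at 0x7fce5d7c0a20>"
SWIG_REPR = re.compile(r"<Swig Object of type '(?P<ctype>.+?) \*' at 0x[0-9a-fA-F]+>")


def swig_pointee_type(repr_text):
    """Return the C type name that a Swig pointer repr claims to point to."""
    match = SWIG_REPR.fullmatch(repr_text.strip())
    if match is None:
        raise ValueError(f"not a Swig pointer repr: {repr_text!r}")
    return match.group("ctype")


def noncanonical_names(report):
    """Map dtype name -> reported C name, for entries that are not '<dtype>_t'."""
    mismatches = {}
    for dtype_name, repr_text in report.items():
        reported = swig_pointee_type(repr_text)
        if reported != dtype_name + "_t":
            mismatches[dtype_name] = reported
    return mismatches


# Linux output from this issue (Faiss 1.8.0, conda).
LINUX_REPORT = {
    "int8": "<Swig Object of type 'char *' at 0x7fce5d7c0a50>",
    "uint8": "<Swig Object of type 'uint8_t *' at 0x7fce5d7c0a20>",
    "int16": "<Swig Object of type 'int16_t *' at 0x7fce5d7c0a80>",
    "uint16": "<Swig Object of type 'uint16_t *' at 0x7fce5d7c0a20>",
    "int32": "<Swig Object of type 'faiss::HNSW::storage_idx_t *' at 0x7fce5d7c0a50>",
    "uint32": "<Swig Object of type 'unsigned int *' at 0x7fce5d7c0a20>",
    "int64": "<Swig Object of type 'int_fast16_t *' at 0x7fce5d7c0a80>",
    "uint64": "<Swig Object of type 'uintmax_t *' at 0x7fce5d7c0a20>",
}

for dtype_name, reported in noncanonical_names(LINUX_REPORT).items():
    print(f"{dtype_name}: reported as {reported}")
```

Only uint8, int16 and uint16 come back under their canonical names in this report; every other dtype is reported under some other spelling.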
@mdouze would you like to provide some more context? For example, what prompted this question?
This becomes a problem when you want to implement a custom SWIG wrapper for a small C++ object, as described in
https://github.com/facebookresearch/faiss/wiki/Python-C---code-snippets#wrapping-small-c-objects-for-use-from-python
I tried to do it in the following code
https://github.com/fairinternal/NeuralCompressionInternal/blob/dev/faiss_index/graph_search_traced.swig
but there is no way to define uint64_t in a way that is compatible with the main Faiss implementation, so I resorted to passing pointers via void*.
TODO: (1) fix it, and (2) add a test with a small C++ wrapper that actually exercises this, so that we catch regressions. This is also quite platform dependent, as uint64_t is defined differently on platforms like Windows.
Started working on this in https://github.com/facebookresearch/faiss/pull/3699
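The platform dependence of the 64-bit integer typedefs can be seen without Faiss at all. The sketch below uses the standard `ctypes` module to list which built-in C integer types have a given width on the current platform; the helper name `c_types_with_width` is hypothetical. On LP64 systems (Linux, macOS) both `long` and `long long` are 64 bits wide, whereas on 64-bit Windows `long` is 32 bits, so a 64-bit typedef cannot resolve to the same underlying type everywhere.

```python
import ctypes

# Built-in signed C integer types, as exposed by ctypes.
C_INTEGER_TYPES = {
    "signed char": ctypes.c_byte,
    "short": ctypes.c_short,
    "int": ctypes.c_int,
    "long": ctypes.c_long,
    "long long": ctypes.c_longlong,
}


def c_types_with_width(bits):
    """Return the names of the built-in C integer types that are `bits` wide here."""
    return [
        name
        for name, ctype in C_INTEGER_TYPES.items()
        if ctypes.sizeof(ctype) * 8 == bits
    ]


# Several candidates for one width means a typedef such as int64_t
# could map to any of them, depending on the platform headers.
for bits in (8, 16, 32, 64):
    print(bits, "->", c_types_with_width(bits))
```

Whenever a width has more than one candidate, two distinct C++ types share the same size, which is the situation in which a wrapper has to pick one name for a NumPy dtype.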
Another datapoint: on the mac (conda 1.8.0), ttypes.py gives:
int8 -> <Swig Object of type 'char *' at 0x11f868480>
uint8 -> <Swig Object of type 'uint8_t *' at 0x11f868450>
int16 -> <Swig Object of type 'int16_t *' at 0x11f8684b0>
uint16 -> <Swig Object of type 'uint16_t *' at 0x11f868450>
int32 -> <Swig Object of type 'int_fast16_t *' at 0x11f868480>
uint32 -> <Swig Object of type 'uint_fast16_t *' at 0x11f868450>
int64 -> <Swig Object of type 'faiss::CMax< float,int64_t >::TI *' at 0x11f8684b0>
uint64 -> <Swig Object of type 'uintmax_t *' at 0x11f868450>
So on the Mac the names are wrong for int32 and uint32.