Move Arrow code to PyCapsule / C API
We rely on an unstable C++ ABI for pyarrow (https://github.com/Point72/csp/tree/main/cpp/csp/python/adapters/vendored/) for historical reasons. This does not work in all circumstances (e.g. when we built Mac wheels with gcc, they were incompatible with the PyPI-provided pyarrow, which is compiled with clang), but we mostly get away with it.
We should move to the PyCapsule / C API.
Here is a useful example (albeit incomplete):
- https://github.com/timkpaine/arrow-cpp-python-nocopy/blob/main/src/apn-python/cpython.h
- https://github.com/timkpaine/arrow-cpp-python-nocopy/blob/main/src/apn-python/cpython.cpp
- https://github.com/timkpaine/arrow-cpp-python-nocopy/blob/main/src/apn-python/common.cpp
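For context, the Arrow PyCapsule interface hands C structs between libraries inside named `PyCapsule` objects (the names `arrow_schema`, `arrow_array` and `arrow_array_stream` come from the Arrow spec), so neither side needs to link against the other's C++ ABI. The sketch below only illustrates that named-capsule handoff using `ctypes` on CPython. It is not csp code: `make_capsule` and `unwrap` are hypothetical helpers, and the 64-byte buffer is a stand-in for a real `ArrowArrayStream` struct.

```python
import ctypes

# Capsule names defined by the Arrow PyCapsule interface.
# PyCapsule_New stores the name pointer without copying it, so these
# bytes objects must stay alive for as long as any capsule uses them.
STREAM_NAME = b"arrow_array_stream"
SCHEMA_NAME = b"arrow_schema"

api = ctypes.pythonapi  # CPython only
api.PyCapsule_New.restype = ctypes.py_object
api.PyCapsule_New.argtypes = [ctypes.c_void_p, ctypes.c_char_p, ctypes.c_void_p]
api.PyCapsule_IsValid.restype = ctypes.c_int
api.PyCapsule_IsValid.argtypes = [ctypes.py_object, ctypes.c_char_p]
api.PyCapsule_GetPointer.restype = ctypes.c_void_p
api.PyCapsule_GetPointer.argtypes = [ctypes.py_object, ctypes.c_char_p]


def make_capsule(address, name):
    """Producer side (hypothetical helper): wrap a raw pointer in a named capsule."""
    # No destructor is registered here; a real producer would pass one that
    # calls the struct's release callback if the capsule is never consumed.
    return api.PyCapsule_New(address, name, None)


def unwrap(capsule, expected_name):
    """Consumer side (hypothetical helper): check the name, then read the pointer."""
    if not api.PyCapsule_IsValid(capsule, expected_name):
        raise TypeError(f"expected a capsule named {expected_name.decode()!r}")
    return api.PyCapsule_GetPointer(capsule, expected_name)


# Stand-in for an ArrowArrayStream struct owned by the producer.
payload = ctypes.create_string_buffer(64)
capsule = make_capsule(ctypes.addressof(payload), STREAM_NAME)

# The consumer receives the same address: no bytes were copied.
address = unwrap(capsule, STREAM_NAME)
print(address == ctypes.addressof(payload))
```

In a real integration the producer would return such a capsule from `__arrow_c_stream__` and the C++ side would call `PyCapsule_GetPointer` directly, as the linked example does.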
Update: We no longer rely on an unstable C++ ABI, but we do incur a full copy in `_arrow_in_memory_table_to_buffers`: https://github.com/Point72/csp/blob/dc7426c08eaee22713b3ce70b99d5bf14dc801df/csp/adapters/parquet.py#L119
We should create two separate issues:
- One for removing the vendored code
- One for using the PyCapsule API (to avoid the full copy)
The vendored code has already been removed, so this issue now covers only (2): using the PyCapsule API to avoid the full copy.