getpy
Just-in-time compilation for custom types
This looks like a really nice library. Looking at #7, it seems like the one place where there's room for improvement is handling types. Would it be possible to look into a just-in-time model for generating and compiling the C++ into a Python module? It would be something like PyTorch's torch.utils.cpp_extension.load. This would allow the key and value to be any NumPy type, simply by generating an instantiation of the template with an appropriate std::array<std::byte, N> template parameter, where N is the item size of the dtype in bytes.
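To make the idea concrete, here is a minimal sketch of the source-generation step only. It renders a C++ explicit instantiation for a given key and value size and derives a cache-friendly module name. The names `Dict` and `dict.hpp` are hypothetical placeholders, not getpy's actual template or header, and the sizes would come from something like `np.dtype(...).itemsize`. Compiling and loading the rendered source is left out.

```python
import hashlib

# `Dict` and "dict.hpp" are hypothetical stand-ins for whatever template
# and header the library actually exposes.
TEMPLATE = """\
#include <array>
#include <cstddef>
#include "dict.hpp"

using Key = std::array<std::byte, {key_size}>;
using Value = std::array<std::byte, {value_size}>;
template class Dict<Key, Value>;
"""


def render_instantiation(key_size: int, value_size: int) -> str:
    """Return C++ source instantiating the template for the given byte sizes."""
    if key_size < 1 or value_size < 1:
        raise ValueError("item sizes must be positive")
    return TEMPLATE.format(key_size=key_size, value_size=value_size)


def module_name(key_size: int, value_size: int) -> str:
    """Derive a stable name so each size pair is compiled at most once."""
    source = render_instantiation(key_size, value_size)
    digest = hashlib.sha1(source.encode("utf-8")).hexdigest()[:8]
    return f"dict_k{key_size}_v{value_size}_{digest}"
```

A JIT loader could use `module_name(8, 4)` as the build directory or cache key and only invoke the compiler when no module with that name exists yet.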
One place I could see this being a problem is with #2 -- is that problem just because of the number of types, or is it intrinsic to compiling even one type?
Actually, it seems like using cppyy directly with Parallel Hashmap might be the easiest way to do this.
That would be very cool. Yes, I imagine that internally converting all NumPy data into the S* datatype, so that it would be compatible with the C++ std::array<std::byte, N> datatype, would be ideal for reducing the combinatorial complexity of compilation. I have mentioned previously that I am afraid of doing any datatype conversion or view change automagically, because I do not want the dictionary to mangle any data. I'd rather it be annoying but usable all of the time than annoying and unusable some of the time.
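For reference, a small sketch of what such a reinterpretation could look like on the NumPy side. It uses the void dtype (`V<N>`) instead of `S<N>`, which is a swapped-in choice and not something the library does today: a view never copies or alters the buffer, but reading an individual `S` element back as Python bytes drops trailing NUL bytes, which is the kind of surprise worth avoiding. The helper name `as_bytes_view` is hypothetical.

```python
import numpy as np


def as_bytes_view(arr):
    """Reinterpret each element as an opaque N-byte field without copying."""
    arr = np.ascontiguousarray(arr)
    return arr.view(f"V{arr.dtype.itemsize}")


# Explicit little-endian dtype so the byte layout is the same on every platform.
keys = np.array([1, 256, 2**40], dtype="<u8")

raw = as_bytes_view(keys)        # dtype V8, same memory as `keys`
restored = raw.view("<u8")       # viewing back recovers the original values

# The S8 view shares the same buffer, but element access trims trailing NULs.
first_as_s8 = keys.view("S8")[0]
```

Because both directions are views, the round trip does not touch the underlying bytes; only the element-level `S` access behaves differently from what a fixed-width byte array would suggest.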
That said, it would probably still be necessary to have JIT compilation of "weird" data sizes even if all datatypes are std::array<std::byte, N> under the hood. With N possible sizes there are still N^2 key/value combinations, which is too many to precompile at 1-byte intervals.
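A quick way to see the scale of the problem is to count the key/value size pairs. The 64-byte upper bound and the list of "common" sizes below are illustrative assumptions, not limits taken from the library.

```python
from itertools import product


def size_pairs(sizes):
    """Return every (key_size, value_size) pair for the given byte sizes."""
    return list(product(sizes, repeat=2))


# Every size from 1 to 64 bytes at 1-byte intervals (assumed upper bound).
every_byte = size_pairs(range(1, 65))

# Only power-of-two sizes, as a plausible precompiled subset (assumption).
common = size_pairs([1, 2, 4, 8, 16])

print(len(every_byte), len(common))
```

Precompiling a small subset and falling back to JIT for the rest keeps the shipped binary small while still covering unusual sizes.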
Incorporating JIT compilation is definitely beyond what I am able to do at this point; however, I agree that it is the best way to do it. I am not sure how pybind11 and JIT compilation would work together. I have never read anything about it in the pybind11 docs.
Adam