
Embed into other languages

Open flip111 opened this issue 4 years ago • 7 comments

Hey thanks for the library. Is there any possibility for a C version of this? Then it would be much easier to embed in other languages. A C-wrapper would also be a good compromise.

flip111 avatar Jan 29 '21 17:01 flip111

Hi @flip111, thank you for your interest!

Writing a C version would take quite some time, so I'd rather go with wrappers from C++ to the target language. For example, for the Python wrapper, I used the pybind11 library and manually wrapped the main class pgm::PGMIndex. Do you have any particular target language in mind?

But if we want to scale, maybe we should look at automatic wrapper generators, such as SWIG and similar tools. But they have a learning curve too.

I'll leave this issue open. Maybe someone more experienced with these tools wants to help 😉

gvinciguerra avatar Jan 30 '21 16:01 gvinciguerra

Going from C++ directly to the target language is not trivial due to ABI issues, which is why I was asking for a C wrapper. This doesn't need additional tools or libraries: https://isocpp.org/wiki/faq/mixing-c-and-cpp

flip111 avatar Jan 30 '21 17:01 flip111

I agree that a C wrapper is preferable, because most languages can fairly trivially adapt to C thanks to its standard ABI, and there are a large number of possible target languages. Still, writing a C wrapper for a C++ program is not a trivial matter.

The C++ code uses language features that have no equivalent in C, so there is a lot of work in creating a C API that approximates those features. With things like C++ templates you have no hope: you either provide support for only some basic instantiations, invoke some kind of C macro magic, or resort to build-time hackery.

Having said that, tools like SWIG end up being a lot of work anyway and may not be worth the effort.

Personally I'd approach it by defining some minimalist C API for the entire library and build on that when and if required. I like this idea because it would make the library much more accessible to those who just want to exploit these powerful new algorithms.
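As a rough illustration of that minimalist approach, here is a sketch of a C-callable wrapper around one fixed instantiation of a templated C++ class, exposed through an opaque handle. Everything in it is an assumption made for illustration: `SortedIndex` is a hypothetical stand-in and not the PGM-index API, and the `idx_u64_*` names are invented here rather than taken from the library's actual C interface.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a templated C++ index class (NOT the PGM-index API).
template <typename K>
class SortedIndex {
    std::vector<K> keys_;

public:
    SortedIndex(const K *first, const K *last) : keys_(first, last) {
        std::sort(keys_.begin(), keys_.end());
    }

    // Position of the first stored key that is >= q.
    std::size_t lower_bound(const K &q) const {
        auto it = std::lower_bound(keys_.begin(), keys_.end(), q);
        return static_cast<std::size_t>(it - keys_.begin());
    }

    std::size_t size() const { return keys_.size(); }
};

// The C-visible surface: an incomplete type plus plain functions.
// In a real project these declarations would live in a header usable from C.
extern "C" {
typedef struct idx_u64 idx_u64;

idx_u64 *idx_u64_create(const std::uint64_t *keys, std::size_t n);
std::size_t idx_u64_size(const idx_u64 *h);
std::size_t idx_u64_lower_bound(const idx_u64 *h, std::uint64_t key);
void idx_u64_destroy(idx_u64 *h);
}

// The handle pins the template to a single instantiation (64-bit keys).
struct idx_u64 {
    SortedIndex<std::uint64_t> impl;
    idx_u64(const std::uint64_t *keys, std::size_t n) : impl(keys, keys + n) {}
};

extern "C" idx_u64 *idx_u64_create(const std::uint64_t *keys, std::size_t n) {
    try {
        return new idx_u64(keys, n);
    } catch (...) {
        return nullptr;  // exceptions must not cross the C boundary
    }
}

extern "C" std::size_t idx_u64_size(const idx_u64 *h) {
    return h->impl.size();
}

extern "C" std::size_t idx_u64_lower_bound(const idx_u64 *h, std::uint64_t key) {
    return h->impl.lower_bound(key);
}

extern "C" void idx_u64_destroy(idx_u64 *h) {
    delete h;
}
```

A caller in C, or in any language with a C FFI, would only ever see the `idx_u64` pointer and the four functions. Supporting another key type means adding another set of functions by hand, which is the "basic instantiations" cost described above.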

darko20 avatar Jan 30 '21 17:01 darko20

I'll work on the C API in the upcoming week. Stay tuned 😉

gvinciguerra avatar Jan 31 '21 13:01 gvinciguerra

Great, I'll be ready to test it when it's done.

I use LMDB right now and I'm very interested to see how it might perform in practice with PGM indexes replacing its B+Trees. LMDB's memory-mapped approach is very fast and convenient, and seems complementary to this.

darko20 avatar Jan 31 '21 15:01 darko20

I pushed a first version of the C interface and some examples of usage under /c-interface/examples :)

gvinciguerra avatar Feb 02 '21 09:02 gvinciguerra

Here there's a re-implementation in Java!

gvinciguerra avatar Apr 22 '24 18:04 gvinciguerra