
Save the rdflib graph as hdt format

Open meisyarahd opened this issue 5 years ago • 5 comments

Is it possible to load my RDF file with rdflib and convert it into an hdt document, and eventually save it as a file?

meisyarahd avatar Sep 29 '20 07:09 meisyarahd

Hi @meisyarahd,

this is not supported with RDFLib. The way rdflib models a graph is incompatible with the immutable nature of HDT files, so there is no way to write one from rdflib. To convert to HDT you are better served by one of these tools: http://www.rdfhdt.org/downloads/

FlorianLudwig avatar Oct 05 '20 15:10 FlorianLudwig

Wouldn't it be possible to implement it as a serialization, just similar to turtle?

white-gecko avatar Oct 05 '20 15:10 white-gecko

@white-gecko yes, that would be possible.

To be honest, for me this use-case is still not really interesting. Generating HDT is more resource intensive than other formats, as it is heavily optimized towards reading, not writing. To generate an HDT file the full graph must be sorted. As this is quite memory intensive, I would always prefer to use the pure C++ implementation and avoid any overhead when generating the HDT.

And for small graphs - where the overhead does not matter - hdt might not be interesting at all.

This is not meant as a "this is a bad idea" but more as a warning to keep memory usage in mind if someone wants to take this on :)

FlorianLudwig avatar Oct 05 '20 17:10 FlorianLudwig

Can we not merge https://github.com/Callidon/pyHDT/pull/4? I think it's up to the user of RDFlib to think about memory, with probably the exception of threading / multiprocessing.

mielvds avatar Apr 20 '21 09:04 mielvds

As a side-note, I found it relatively easy to integrate HDT-CPP via docker into Python. Just create a turtle tempfile and hand the result over to HDT-CPP.

chiarcos avatar Mar 14 '22 16:03 chiarcos
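
For anyone who wants to try the tempfile approach described above, here is a minimal sketch rather than a tested integration. It assumes the `rdfhdt/hdt-cpp` Docker image is available and that it provides the `rdf2hdt` command with a `-f turtle` input-format option; check both against your HDT-CPP version. The function names `build_rdf2hdt_command` and `graph_to_hdt` are made up for this example, and `graph` can be any object with an rdflib-style `serialize(destination=..., format=...)` method.

```python
import subprocess
import tempfile
from pathlib import Path


def build_rdf2hdt_command(workdir, ttl_name, hdt_name, image="rdfhdt/hdt-cpp"):
    """Build the docker argv that converts workdir/ttl_name to workdir/hdt_name.

    Assumption: the image ships the `rdf2hdt` CLI and accepts `-f turtle`.
    """
    return [
        "docker", "run", "--rm",
        # Mount the working directory so the container sees both files.
        "-v", f"{Path(workdir).resolve()}:/data",
        image,
        "rdf2hdt", "-f", "turtle",
        f"/data/{ttl_name}", f"/data/{hdt_name}",
    ]


def graph_to_hdt(graph, hdt_path, image="rdfhdt/hdt-cpp"):
    """Serialize `graph` to a temporary Turtle file and hand it to HDT-CPP."""
    hdt_path = Path(hdt_path).resolve()
    # Create the temp dir next to the target so the final move stays on one filesystem.
    with tempfile.TemporaryDirectory(dir=hdt_path.parent) as tmp:
        graph.serialize(destination=str(Path(tmp) / "graph.ttl"), format="turtle")
        cmd = build_rdf2hdt_command(tmp, "graph.ttl", "graph.hdt", image)
        # check=True raises CalledProcessError if the conversion fails.
        subprocess.run(cmd, check=True)
        (Path(tmp) / "graph.hdt").replace(hdt_path)
    return hdt_path
```

With an rdflib graph `g` loaded from your RDF file, the call would be `graph_to_hdt(g, "out.hdt")`. The sorting and memory cost mentioned earlier in the thread still applies; it just happens inside the C++ tool instead of in Python.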