xg-old icon indicating copy to clipboard operation
xg-old copied to clipboard

Indexing small graphs blows up when including many paths

Open edawson opened this issue 8 years ago • 2 comments

I have a relatively small (34MB) graph that when attempting to index uses over 16GB of RAM and is subsequently killed on my machine. @ekg thinks this may be caused by inefficiencies in storing small paths. As paths are stored as bit vectors over the entire graph (rather than storing them with a start and end position), short paths still take up O(length of graph) space, so having many of them in one graph can invite chaos.

edawson avatar Jan 28 '16 15:01 edawson

How many paths are in your tiny graph? Is there a test case we could attach?

ekg avatar Jan 28 '16 15:01 ekg

@adamnovak's implementation of succinct threads is a preferable way to do this. Construction is still linear in the number of threads, which isn't ideal, but it will help a lot in terms of memory usage. What needs to be done is to specify a set of positional paths which should be indexed using the O(1) technique, which is expensive and not compressible. The rest can still be queried or converted into positional paths as needed.

ekg avatar Feb 25 '16 17:02 ekg