succinct-cpp
Succinct C++
- Put .py files in examples
- Reformatted code to Google-style format
- Removed unnecessary code
- Reorganized pysuccinct structs to have query and compress functions
When preprocessing large datasets, it'd be helpful to print out a few short messages indicating progress. For example, "ISA tmp file written out to disk", or "progress estimate: 33%", etc.
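A minimal sketch of what such messages could look like. `ProgressMessage` and `ReportProgress` are hypothetical names used for illustration; they are not existing succinct-cpp functions, and writing to `std::cerr` is only one possible choice of sink.

```cpp
#include <cstdint>
#include <iostream>
#include <string>

// Hypothetical helper (not part of succinct-cpp): formats a percentage
// estimate from the amount of work done so far.
// Note: done * 100 can overflow for inputs near 2^64 / 100; a real
// implementation may want to divide first or use floating point.
std::string ProgressMessage(uint64_t done, uint64_t total) {
  uint64_t pct = (total == 0) ? 100 : (done * 100) / total;
  return "progress estimate: " + std::to_string(pct) + "%";
}

// Hypothetical helper: emits one status line. std::cerr is assumed here so
// that status output does not mix with data written to stdout.
void ReportProgress(const std::string& message, std::ostream& out = std::cerr) {
  out << message << std::endl;
}
```

A construction routine could then call `ReportProgress("ISA tmp file written out to disk")` after each phase, or `ReportProgress(ProgressMessage(done, total))` inside a long loop.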
During construction of large datasets, if the disk runs out of space, the temporary ISA, SA, and NPA files are still written silently, but only _partially_, and are therefore _corrupted_. We should throw exceptions to...
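One way to surface a failed write, sketched under assumptions: the helper name `WriteOrThrow` and the choice of `std::runtime_error` are hypothetical, and the actual construction code may use a different I/O layer. The idea is simply to check the stream state after writing and flushing instead of ignoring it.

```cpp
#include <cstdint>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not part of succinct-cpp): writes `data` to `path`
// and throws if opening, writing, or flushing fails, so a full disk does
// not leave a silently truncated file behind.
void WriteOrThrow(const std::string& path, const std::vector<uint64_t>& data) {
  std::ofstream out(path, std::ios::binary | std::ios::trunc);
  if (!out) {
    throw std::runtime_error("cannot open " + path + " for writing");
  }
  out.write(reinterpret_cast<const char*>(data.data()),
            static_cast<std::streamsize>(data.size() * sizeof(uint64_t)));
  out.flush();  // Push buffered bytes to the OS so errors show up here.
  if (!out) {
    throw std::runtime_error("short or failed write to " + path);
  }
}
```

Callers can catch the exception, delete the partial temporary file, and abort construction with a clear error.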
IIRC, if an input file of N lines (with the last character being a \n) is passed to SuccinctShard ctor, num_keys() will incorrectly return N+1, treating the last key as...
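For illustration, a counting rule that would avoid the off-by-one, assuming a trailing `\n` terminates the last record rather than starting a new one. `CountKeys` is a hypothetical standalone function, not the actual `SuccinctShard` logic.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of succinct-cpp): counts newline-delimited
// records in `input`.
// Assumption: a '\n' at the very end terminates the last record and does
// not introduce an additional empty one.
size_t CountKeys(const std::string& input) {
  if (input.empty()) return 0;
  size_t newlines = 0;
  for (char c : input) {
    if (c == '\n') ++newlines;
  }
  // A final line without a terminating '\n' still counts as one record.
  return input.back() == '\n' ? newlines : newlines + 1;
}
```

Under this rule, an input of N lines yields N keys whether or not the file ends with a newline.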
Currently, there are cases where several names may refer to the same entity, and there are many names (e.g. handler / aggregator / server / shard / worker, etc.). It'd...
E.g. depending on whether the end of the file contains a '\n'. @anuragkh and I have a discussion thread off GitHub. Opening this issue as a reminder :)
When `len` is a positive int and the queried key has an empty value, `string::resize(-1)` will be called, and an exception occurs. A proposed fix is [here](https://github.com/concretevitamin/succinct-cpp/commit/0e4564967f5338bab24d7540d6ad11057a3de0be).
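The failure mode is that `-1` converts to the largest `size_t` value, which exceeds `max_size()` and makes `std::string::resize` throw `std::length_error`. A minimal sketch of a guard follows; `ClampedPrefix` is a hypothetical helper for illustration and is not the code from the linked commit.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not part of succinct-cpp): returns at most `len`
// leading bytes of `value`. Lengths are clamped before any size
// computation, so an empty value or a non-positive `len` can never turn
// into a huge unsigned size.
std::string ClampedPrefix(const std::string& value, int64_t len) {
  if (len <= 0 || value.empty()) return std::string();
  size_t n = std::min(static_cast<size_t>(len), value.size());
  return value.substr(0, n);
}
```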
For example, in `SuccinctFile`, `extract()` takes a `uint64_t`, but `search()` puts resulting offsets into `vector`. I propose we use `int64_t` everywhere.
The destructor should take care of cleaning up the NPA's heap-allocated variables.
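One possible approach, shown on a stand-in class because the real NPA layout is not reproduced here: hold each heap allocation in a `std::unique_ptr`, so the compiler-generated destructor releases it. `NpaLike` and its members are hypothetical, and this swaps manual `delete[]` calls for RAII rather than showing the existing code.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical stand-in for an NPA-like structure (not the succinct-cpp
// class). The array is owned by a unique_ptr, so no hand-written
// destructor is needed and the memory is freed exactly once.
class NpaLike {
 public:
  // Allocates n zero-initialized cells.
  explicit NpaLike(size_t n) : size_(n), cells_(new uint64_t[n]()) {}

  size_t size() const { return size_; }
  uint64_t& operator[](size_t i) { return cells_[i]; }

 private:
  size_t size_;
  std::unique_ptr<uint64_t[]> cells_;  // delete[] runs automatically.
};
```

A side effect of this design is that the class becomes move-only, which also rules out accidental double frees through copies.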