pisa
PISA: Performant Indexes and Search for Academia
## Overview

Optional compilation of dependencies allows for faster compilation (in cases where the dependencies are not necessary) and possibly for working around dependent libraries that do not compile on a target.

## Details...
Gumbo parser relies on autotools for its build, yet we don't mention this [in the docs](https://pisa.readthedocs.io/en/latest/getting_started.html#building-the-code). Here's an example of what is needed on Ubuntu 18.04:

```bash
sudo apt-get install...
```
This was introduced via #387 - The problem is that the documentation for TBB states that passing a parameter `n` to `max_allowed_parallelism` will result in `n-1` worker threads operating: https://software.intel.com/en-us/node/589744...
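To make the off-by-one concrete, here is a small sketch of the arithmetic described above. It assumes, as the issue states, that a limit of `n` passed to `max_allowed_parallelism` yields `n-1` worker threads. Both helper names are hypothetical and are not part of PISA or TBB.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of PISA or TBB): the number of worker
// threads obtained from a limit of `limit`, assuming the behavior the
// issue describes (a limit of n gives n-1 workers).
inline std::size_t worker_threads_for_limit(std::size_t limit)
{
    assert(limit >= 1);
    return limit - 1;
}

// Hypothetical inverse: the limit to request so that `workers` worker
// threads end up running under the same assumption.
inline std::size_t limit_for_worker_threads(std::size_t workers)
{
    return workers + 1;
}
```

Under this reading, a caller who wants a given number of worker threads would pass `limit_for_worker_threads(workers)` rather than `workers` itself.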
We should define it in the `pisa` namespace, and while we're at it, remove the underscore from the name.
Related to #417. The progress bar causes a problem when running a long process, as it produces one line per second. The usual solution to that is to detect whether...
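The issue text is cut off after "detect whether", so the following is only a sketch of one common approach, not necessarily the one the author had in mind. It checks whether the output is an interactive terminal and falls back to sparse milestone lines when it is not. `stderr_is_terminal` and `should_print_progress` are hypothetical names, and `isatty` is the POSIX call, so this assumes a POSIX platform.

```cpp
#include <cassert>
#include <unistd.h>  // POSIX: isatty, STDERR_FILENO

// Returns true if stderr is attached to an interactive terminal (POSIX only).
inline bool stderr_is_terminal()
{
    return isatty(STDERR_FILENO) != 0;
}

// Hypothetical policy: on a terminal, redraw on every tick; otherwise
// (for example, when output is redirected to a log file) print only when
// the completed percentage crosses a multiple of `step`.
inline bool should_print_progress(bool is_terminal,
                                  int previous_percent,
                                  int current_percent,
                                  int step = 10)
{
    assert(step > 0);
    if (is_terminal) {
        return true;
    }
    return current_percent / step > previous_percent / step;
}
```

With a step of 10, a redirected run would emit roughly ten progress lines in total instead of one per second.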
**Describe the solution you'd like**

The `parse_collection` command seems to have several potential areas for improvement:

1. "Remapping IDs" could be done in parallel
2. "Concatenating batches" and "Remapping IDs"...
**Describe the solution you'd like**

The BitFunnel paper contained some additions to the `partitioned_elias_fano` codebase which allowed measuring queries per second with multiple threads. See here: `https://github.com/BitFunnel/partitioned_elias_fano/blob/master/Runner/QueryLogRunner.cpp#L56` It would be...
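As a rough illustration of what a multi-threaded queries-per-second measurement could look like, here is a minimal sketch using only the C++ standard library. It does not reproduce the BitFunnel code linked above. `measure_qps` is a hypothetical name, and `run_query` is a stand-in for whatever actually processes a query.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <string>
#include <thread>
#include <vector>

// Hypothetical helper, not taken from PISA or BitFunnel: runs every query
// once, spreading the work across `num_threads` threads, and returns the
// observed throughput in queries per second.
inline double measure_qps(std::vector<std::string> const& queries,
                          std::size_t num_threads,
                          std::function<void(std::string const&)> const& run_query)
{
    assert(num_threads > 0);
    std::atomic<std::size_t> next{0};  // index of the next unclaimed query
    auto worker = [&] {
        // Each thread claims one query at a time until none are left.
        for (std::size_t i = next.fetch_add(1); i < queries.size(); i = next.fetch_add(1)) {
            run_query(queries[i]);
        }
    };

    auto start = std::chrono::steady_clock::now();
    std::vector<std::thread> threads;
    threads.reserve(num_threads);
    for (std::size_t t = 0; t < num_threads; ++t) {
        threads.emplace_back(worker);
    }
    for (auto& thread : threads) {
        thread.join();
    }
    std::chrono::duration<double> elapsed = std::chrono::steady_clock::now() - start;
    return static_cast<double>(queries.size()) / elapsed.count();
}
```

Claiming queries through a shared atomic counter keeps the threads balanced even when individual queries take very different amounts of time. Note that the measured interval also includes thread start-up, which matters for very short query logs.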
We should update the documentation https://pisa.readthedocs.io/en/latest/getting_started.html to reflect the set of compilers we test against in our CI workflow.
All available encoding values should be documented. There is somewhat incomplete documentation here: https://pisa.readthedocs.io/en/latest/compress_index.html (only mentions actual values of the parameter for some compression techniques). I think we should do...