pisa
Limit memory usage during index compression
Right now, I think everything resides in main memory for the entire run, so you need a lot of memory to compress big collections like ClueWeb. It should be easy enough to refactor the compression step so that it only keeps as many posting lists in memory as there are threads running.
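To make the idea concrete, here is a rough sketch in Python rather than PISA's actual code. Everything in it is an assumption for illustration: `compress_bounded`, `compress`, and `write` are hypothetical names, and `posting_lists` is assumed to be an iterator that reads one list at a time from disk. A semaphore with one slot per thread blocks the reader, so a new list is only loaded once a worker has finished with an earlier one.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def compress_bounded(posting_lists, compress, write, num_threads=4):
    """Compress lazily read posting lists, holding at most num_threads at once.

    posting_lists: iterator yielding one posting list at a time (assumed lazy)
    compress, write: hypothetical callbacks, not part of PISA's API
    Returns the number of posting lists processed.
    """
    slots = threading.Semaphore(num_threads)  # one slot per worker thread
    write_lock = threading.Lock()             # serialize access to the output

    def work(index, postings):
        try:
            encoded = compress(postings)
            with write_lock:
                # Lists may finish out of order, so the index is passed along.
                write(index, encoded)
        finally:
            slots.release()  # free the slot so the next list can be read

    source = iter(posting_lists)
    futures = []
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        while True:
            slots.acquire()  # wait for a free slot before reading anything
            try:
                postings = next(source)
            except StopIteration:
                slots.release()
                break
            futures.append(pool.submit(work, len(futures), postings))
    for future in futures:
        future.result()  # re-raise any exception from a worker
    return len(futures)
```

Because lists can complete out of order, the writer in this sketch receives the list index and would have to place the output accordingly (or buffer until the next expected index arrives). Whether that fits the existing index layout is something I have not checked.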