Repeated calls to Document::decompress

Open lgrz opened this issue 5 years ago • 0 comments

It is not clear how much extra CPU time is spent on repeated calls to Document::decompress. See the following inner loop in extract_features: https://github.com/rmit-ir/tesserae/blob/c565cda55765e8491cb184439d8fbb296aba5d4a/src/extract_features.cpp#L580

The Document class should take a similar approach to the postings, which are decompressed each time they are fetched for a query. Fetching them returns a "dummy"-like struct that holds the decompressed information and is de-allocated after use.

Second, the naming conventions should align with the postings by using encode / decode.
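
A rough sketch of what the two suggestions above might look like together. This is not the current tesserae code: the `Decoded` struct, the `gaps_` member, the gap (delta) encoding, and the `sum_terms` helper are all assumptions made up for illustration, and the real Document layout and codec may differ. The idea is only that `decode()` hands back a temporary struct, the caller decodes once outside the inner loop, and the struct is freed when it goes out of scope.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the real Document class.
class Document {
public:
    // Temporary "dummy"-like struct holding the decoded data.
    // It owns its buffer, so it is de-allocated when it leaves scope.
    struct Decoded {
        std::vector<std::uint32_t> terms;
    };

    // encode: store non-decreasing term ids as gaps (assumed codec,
    // used here only as a placeholder for the real compression).
    static Document encode(const std::vector<std::uint32_t>& sorted_terms) {
        Document d;
        std::uint32_t prev = 0;
        for (std::uint32_t t : sorted_terms) {
            assert(t >= prev && "encode expects non-decreasing term ids");
            d.gaps_.push_back(t - prev);
            prev = t;
        }
        return d;
    }

    // decode: rebuild the term ids into a fresh temporary struct.
    Decoded decode() const {
        Decoded out;
        out.terms.reserve(gaps_.size());
        std::uint32_t acc = 0;
        for (std::uint32_t g : gaps_) {
            acc += g;
            out.terms.push_back(acc);
        }
        return out;
    }

private:
    std::vector<std::uint32_t> gaps_;
};

// Hypothetical inner loop: decode once up front, then reuse the result
// on every pass instead of calling decode() inside the loop.
std::uint64_t sum_terms(const Document& doc, int passes) {
    const Document::Decoded decoded = doc.decode();
    std::uint64_t total = 0;
    for (int i = 0; i < passes; ++i) {
        for (std::uint32_t t : decoded.terms) {
            total += t;
        }
    }
    return total;
}
```

Whether this actually saves measurable CPU in extract_features would still need profiling, since the cost of the existing repeated decompress calls is the open question here.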

May 26 '20 08:05 lgrz