anserini
Metadata alignment with Pyserini prebuilt indexes
Pyserini's prebuilt index definitions carry more metadata than Anserini's. Specifically, they include the following fields: size compressed (bytes), total terms, documents, unique terms, and downloaded. Anserini's prebuilt index definitions don't have these fields. Because Pyserini ports some prebuilt indexes over directly from Anserini, the ported entries are missing these fields compared to the 'native' Pyserini indexes. Aligning the two, i.e., adding these metadata fields to Anserini's definitions, would be nice.
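To make the gap concrete, here is a rough sketch of a check that reports which of the fields listed above are absent from an index definition. It is not taken from either codebase: the dictionary layout, the exact key spellings, the helper name `missing_metadata`, and both sample entries are assumptions made for illustration.

```python
from __future__ import annotations

# Fields named in this issue. The exact key spellings are an assumption;
# adjust them to match the real index definitions.
EXPECTED_FIELDS = (
    "size compressed (bytes)",
    "total_terms",
    "documents",
    "unique_terms",
    "downloaded",
)


def missing_metadata(index_info: dict) -> list[str]:
    """Return the expected metadata fields that are absent from one definition."""
    return [field for field in EXPECTED_FIELDS if field not in index_info]


# Hypothetical entry ported from Anserini: no size or statistics fields.
ported_entry = {
    "description": "placeholder description",
    "filename": "placeholder-index.tar.gz",
}

# Hypothetical 'native' Pyserini-style entry with placeholder values.
native_entry = {
    "description": "placeholder description",
    "filename": "placeholder-index.tar.gz",
    "size compressed (bytes)": 0,
    "total_terms": 0,
    "documents": 0,
    "unique_terms": 0,
    "downloaded": False,
}

if __name__ == "__main__":
    print("ported entry is missing:", missing_metadata(ported_entry))
    print("native entry is missing:", missing_metadata(native_entry))
```

Running a check like this over all ported definitions would give a list of the entries that need to be filled in once the fields exist on the Anserini side.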