
Metadata alignment with Pyserini prebuilt indexes

Open lilyjge opened this issue 6 months ago • 0 comments

Pyserini's prebuilt index definitions contain more metadata than Anserini's. Specifically, they include the following fields: size compressed (bytes), total terms, documents, unique terms, and downloaded. Anserini's prebuilt index definitions don't have these fields. Since Pyserini ports some prebuilt indexes over from Anserini directly, the ported indexes are missing these fields compared to the 'native' Pyserini indexes. Alignment, i.e., adding these metadata fields to Anserini, would be nice.
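
To make the gap concrete, here is a minimal sketch of how one might audit index definitions for the missing fields. It assumes each definition is available as a plain Python dict keyed by field name; the field names are copied from the list above as written and may not match the exact keys used in either codebase. The table entries, index names, and the `missing_fields` and `audit` helpers are hypothetical and only for illustration.

```python
# Metadata fields named in this issue as present in Pyserini's definitions
# but absent from Anserini's. Spelling follows the issue text; the real
# dictionary keys may differ (assumption).
EXPECTED_FIELDS = [
    "size compressed (bytes)",
    "total terms",
    "documents",
    "unique terms",
    "downloaded",
]


def missing_fields(index_info: dict) -> list:
    """Return the expected metadata fields that one index definition lacks."""
    return [field for field in EXPECTED_FIELDS if field not in index_info]


def audit(index_table: dict) -> dict:
    """Map each index name to its missing fields, omitting complete entries."""
    report = {}
    for name, info in index_table.items():
        gaps = missing_fields(info)
        if gaps:
            report[name] = gaps
    return report


# Hypothetical definitions with placeholder values, not real index statistics.
example_table = {
    "native-index": {
        "description": "placeholder",
        "size compressed (bytes)": 0,
        "total terms": 0,
        "documents": 0,
        "unique terms": 0,
        "downloaded": False,
    },
    "ported-index": {
        "description": "placeholder",
    },
}

if __name__ == "__main__":
    for name, gaps in audit(example_table).items():
        print(f"{name}: missing {', '.join(gaps)}")
```

Running an audit like this over the ported definitions would list exactly which entries need the extra fields filled in on the Anserini side.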

lilyjge · Jun 23 '25 13:06