Store full indexing configuration with every index?

Open jan-niestadt opened this issue 3 years ago • 1 comments

We have a server where users can create a custom indexing configuration and upload data to be indexed. This server is running an older version of BlackLab, and some of the indexes use older file formats that we're dropping support for.

This might be fixed by exporting the data from the content store and re-indexing it with the new code; however, we don't know the full indexing configuration each index was generated with. (We can still convert the files from the old formats to the newer ones, but that will take a bit more work.)

For this reason, it would probably be good to automatically store the full index format configuration file in the index directory. It wouldn't be used for anything; it would just be there for future reference.
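
To make the idea concrete, here is a minimal sketch of archiving the configuration at indexing time. It is not BlackLab code: the function name `archive_format_config` and the archive file name `input-format-archive` are hypothetical, and it assumes the format configuration is a single file on disk.

```python
import shutil
from pathlib import Path


def archive_format_config(config_file: Path, index_dir: Path,
                          archive_name: str = "input-format-archive") -> Path:
    """Copy the format config used for indexing into the index directory.

    Hypothetical helper: the copy is for future reference only and is
    never read back by the indexing code.
    """
    index_dir.mkdir(parents=True, exist_ok=True)
    # Keep the original extension so the archived file stays readable as-is.
    target = index_dir / (archive_name + config_file.suffix)
    shutil.copy2(config_file, target)  # copy2 also preserves timestamps
    return target
```

Calling `archive_format_config(Path("my-format.yaml"), Path("my-index"))` once, right after the index is created, would be enough; if someone later needs to reindex, the exact configuration is sitting next to the data.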

jan-niestadt • Feb 21 '22 15:02

(When we integrate with Solr, this would have to be stored in the Lucene index itself, not in a separate file.)

jan-niestadt • Jul 04 '22 10:07

The new integrated index format will store the input format config inside the index metadata.

We will probably deal with the old AutoSearch corpora by using a proxy that forwards requests to an older BlackLab version for older indexes. Newer indexes will have the format stored, so in the future, reindexing would be a viable solution.

jan-niestadt • Sep 05 '22 11:09