The detector database needs to be invalidated for relevant code changes
This came out of PR #573.
@nelimee : "We need to be able to version the detector database if we want that feature [the transparent use of the detector database] to work as expected, because we need to be able to tell tqec "the database format changed, you cannot rely on previous database formats, please re-generate it from scratch"."
Previously, it would be up to the user to regenerate their own detector databases if a relevant part of the code changed and invalidated them. Now that the database is generated and used 'under the hood' by default, the code also needs to be responsible for regenerating the default database 'under the hood' when necessary.
@nelimee provided an initial list of events which should invalidate any previously computed database:
- the DetectorDatabase class changed,
- the _DetectorDatabaseKey class changed (e.g., how we compute its hash),
- an existing "plaquette circuit <=> plaquette name" link has been broken (e.g., changing the implementation of a plaquette without changing its name).
Note that this can likely be fixed quite easily by adding a version attribute to the database, and rejecting any stored database whose version is not up-to-date.
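A minimal sketch of that version check, assuming the database is stored together with a version field. `CURRENT_DATABASE_VERSION`, `serialize_database` and `load_if_current` are hypothetical names, not part of tqec, and the JSON payload only stands in for whatever format the real database uses.

```python
import json
from typing import Optional

# Hypothetical constant: bumped by hand whenever a change invalidates
# previously stored databases.
CURRENT_DATABASE_VERSION = 1


def serialize_database(mapping: dict) -> str:
    """Store the database content together with the current version."""
    return json.dumps({"version": CURRENT_DATABASE_VERSION, "mapping": mapping})


def load_if_current(text: str) -> Optional[dict]:
    """Return the stored mapping, or None if it must be regenerated.

    None is returned when the payload is unreadable, has no version,
    or has a version different from CURRENT_DATABASE_VERSION.
    """
    try:
        payload = json.loads(text)
    except json.JSONDecodeError:
        return None
    if not isinstance(payload, dict):
        return None
    if payload.get("version") != CURRENT_DATABASE_VERSION:
        return None
    return payload.get("mapping")
```

A caller would treat a `None` result as "regenerate from scratch", which also covers databases written before any version field existed.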
That makes sense. How would you envision the version number being increased? Would it need to be increased manually by anyone making a change to the codebase that would invalidate the database? I don't love that, because it seems quite prone to people forgetting or not being aware of it. But I can't see a programmatic way of doing it. This would at least be better than having no mechanism in place to invalidate the database. And if someone did forget to increase the version number in their code push, it would be a quick fix once someone reported the bug of the database not working. Is this what you were thinking of? If so, I can have a go at it, if you wouldn't mind assigning me.
Right now, the only programmatic way of doing that would be to check that each plaquette name is associated with a unique circuit across the whole database, which will likely incur high overheads when using the database. For that reason, I think it is better to simply bump the database version manually. I will try to write an explicit and exhaustive list of the cases that should trigger a bump in database version. Also, we might be able to test that in CI by saving the database across runs.
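For illustration, the programmatic check mentioned above could look roughly like the sketch below. It assumes the database content can be flattened into `(plaquette_name, circuit_text)` pairs, which is an assumption about the storage layout rather than how tqec actually exposes it. `find_ambiguous_plaquette_names` is a hypothetical helper.

```python
from collections import defaultdict
from typing import Iterable


def find_ambiguous_plaquette_names(
    entries: Iterable[tuple[str, str]],
) -> list[str]:
    """Return the plaquette names linked to more than one distinct circuit.

    `entries` is assumed to yield (plaquette_name, circuit_text) pairs
    collected from every key stored in the database.
    """
    circuits_by_name: dict[str, set[str]] = defaultdict(set)
    for name, circuit in entries:
        circuits_by_name[name].add(circuit)
    # A name mapped to several circuits means a name/circuit link was broken.
    return sorted(
        name for name, circuits in circuits_by_name.items() if len(circuits) > 1
    )
```

The scan touches every stored entry, so running it on each database access is where the overhead would come from. Running it once in CI on a database saved across runs would avoid paying that cost at use time.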
Ideally, it would also be nice to have a specific version number for the database for development purposes. In #562 I had to regularly remove the database (and think about doing it) in order to pass tests. It would have been nice to be able to:
- when branching, change the database version in the branch to a special value meaning "always invalidate and recompute",
- perform my changes and debugging with a stateless program (at least, there is no implicit state used by the library),
- change back the version to something more meaningful, bumping it if I made changes invalidating the database, or reverting to the version on main if that is not the case.
The special value can be anything in practice, but None is likely a good choice.
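A small sketch of how such a sentinel could be handled, assuming the version is compared in a single place. `is_database_usable` and its parameters are hypothetical and only illustrate the "None always invalidates" rule.

```python
from typing import Optional


def is_database_usable(
    stored_version: Optional[int], current_version: Optional[int]
) -> bool:
    """Decide whether a stored database may be reused.

    A `current_version` of None is the development sentinel: the stored
    database is never trusted and is always recomputed.
    """
    if current_version is None:
        return False
    return stored_version == current_version
```

With this rule, a development branch sets the current version to `None` and behaves as if no database were ever saved. Before merging, the version goes back to an integer, bumped only if the branch actually invalidated the database.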