
The detector database needs to be invalidated for relevant code changes

Open BSchelpe opened this issue 8 months ago • 4 comments

This came out of PR #573.

@nelimee : "We need to be able to version the detector database if we want that feature [the transparent use of the detector database] to work as expected, because we need to be able to tell tqec "the database format changed, you cannot rely on previous database formats, please re-generate it from scratch"."

Previously, it would be up to the user to regenerate their own detector databases if a relevant part of the code changed and invalidated them. Now that the database is generated and used 'under the hood' by default, the code also needs to be responsible for regenerating the default database 'under the hood' when necessary.

@nelimee provided an initial list of events which should invalidate any previously computed database:

  • the DetectorDatabase class changed,

  • the _DetectorDatabaseKey class changed (e.g., how we compute its hash),

  • an existing "plaquette circuit <=> plaquette name" link has been broken (e.g., changing the implementation of a plaquette without changing its name),

BSchelpe avatar Apr 29 '25 19:04 BSchelpe

Note that this can likely be fixed quite easily by adding a version attribute to the database, and rejecting any database whose version is not up to date.
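As a rough illustration of the version-attribute idea, here is a minimal sketch. It assumes a JSON file on disk purely to stay self-contained; `CURRENT_DATABASE_VERSION`, `save_database` and `load_database` are hypothetical names and not part of the tqec API.

```python
import json
from pathlib import Path

# Hypothetical constant: bumped by hand whenever a code change
# invalidates previously stored databases.
CURRENT_DATABASE_VERSION = 1


def save_database(path: Path, entries: dict, version=CURRENT_DATABASE_VERSION) -> None:
    """Store the entries together with the version they were computed with."""
    path.write_text(json.dumps({"version": version, "entries": entries}))


def load_database(path: Path) -> dict:
    """Return the stored entries, or an empty dict if the file is missing
    or was written with a different database version."""
    if not path.exists():
        return {}
    payload = json.loads(path.read_text())
    # .get() also rejects files written before the version field existed.
    if payload.get("version") != CURRENT_DATABASE_VERSION:
        return {}
    return payload["entries"]
```

Returning an empty database on a version mismatch means the caller simply recomputes everything, which is the "re-generate it from scratch" behaviour described above.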

nelimee avatar Apr 30 '25 08:04 nelimee

That makes sense. How would you envision the version number being increased? Would it need to be increased manually by anyone making a change to the codebase that would invalidate the database? I don't love that, because it seems quite prone to people forgetting or not being aware of it. But I can't see a programmatic way of doing it. This would at least be better than having no mechanism in place to invalidate the database. And if someone did forget to increase the version number when pushing their change, it would be a quick fix once someone reported that the database was not working. Is this what you were thinking of? If so, I can have a go at it, if you wouldn't mind assigning me.

BSchelpe avatar May 01 '25 17:05 BSchelpe

Right now, the only programmatic way of doing that would be to check that each plaquette name is associated with a unique circuit across the whole database, which will likely incur a high overhead when using the database. For that reason, I think it is better to simply bump the database version manually. I will try to write an explicit and exhaustive list of the cases that should trigger a bump in database version. Also, we might be able to test that in CI by saving the database across runs.
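For reference, the uniqueness check could look roughly like the sketch below. It assumes the database can be flattened into `(plaquette name, circuit text)` pairs, which is an assumption about the data layout rather than how tqec stores it; `find_ambiguous_names` is a hypothetical helper.

```python
from collections import defaultdict


def find_ambiguous_names(pairs):
    """Return the plaquette names that map to more than one distinct circuit.

    `pairs` is an iterable of (name, circuit_text) tuples. An empty result
    means every name is associated with a unique circuit.
    """
    circuits_by_name = defaultdict(set)
    for name, circuit_text in pairs:
        circuits_by_name[name].add(circuit_text)
    return {
        name: circuits
        for name, circuits in circuits_by_name.items()
        if len(circuits) > 1
    }
```

The scan touches every stored entry, which is where the overhead comes from if it runs each time the database is used.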

nelimee avatar May 02 '25 08:05 nelimee

Ideally, it would also be nice to have a specific version number for the database for development purposes. In #562 I had to regularly remove the database (and think about doing it) in order to pass tests. It would have been nice to be able to:

  1. when branching, change the database version in the branch to a special value saying "always invalidate and recompute",
  2. perform my changes and debugging with a stateless program (at least, there is no implicit state used by the library),
  3. change back the version to something more meaningful, bumping it if I made changes invalidating the database, or reverting to the version on main if that is not the case.

The special value can be anything in practice, but None is likely a good choice.
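A minimal sketch of how `None` could act as that special value, assuming the version is compared at load time; `is_database_usable` is a hypothetical helper, not an existing tqec function.

```python
from typing import Optional


def is_database_usable(
    stored_version: Optional[int], current_version: Optional[int]
) -> bool:
    """Decide whether a stored database may be reused.

    A current version of None means "always invalidate and recompute",
    so no stored database is ever accepted in that mode.
    """
    if current_version is None:
        return False
    return stored_version == current_version
```

On a development branch the version constant would be set to `None`, then set back to an integer (bumped or not) before merging.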

nelimee avatar May 05 '25 13:05 nelimee