Construct a new dummy dataset & apply it to local/staging
You will find some inconsistencies between the claimed database schema in db/schema.rb and the one that is actually present in our staging environment called mampf-experimental. To compare them, use
rails db:schema:dump # (!) don't ever use db:schema:load as it will destroy all data (!!!)
in mampf-experimental and compare the result with the db/schema.rb from the mampf repo. For development purposes, we locally pre-seed the database with this dataset. In staging and next (mampf-experimental and mampf-dev), we use a more "advanced" dataset that is somewhat closer to our actual production data. Still, it is inconsistent with the production dataset, e.g. some foreign keys do not exist.
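To make the comparison less error-prone than eyeballing two files, the two schema dumps could be diffed per table. The following is only a minimal sketch: `parse_schema` and `schema_diff` are hypothetical helper names, and the parser assumes the usual `create_table "..." do |t| ... end` layout of a generated db/schema.rb. It ignores anything outside `create_table` blocks (e.g. `add_foreign_key` lines).

```ruby
# Parse a schema.rb dump into { table_name => [column/index definition lines] }.
# Assumption: the file follows the layout that `rails db:schema:dump` generates.
def parse_schema(text)
  tables = {}
  current = nil
  text.each_line do |line|
    stripped = line.strip
    if (match = stripped.match(/\Acreate_table "([^"]+)"/))
      current = match[1]
      tables[current] = []
    elsif stripped == "end"
      current = nil
    elsif current && stripped.start_with?("t.")
      tables[current] << stripped
    end
  end
  tables
end

# Compare the schema from the repo with the one dumped from an environment.
def schema_diff(repo_text, dumped_text)
  repo = parse_schema(repo_text)
  dumped = parse_schema(dumped_text)
  diff = {
    only_in_repo: repo.keys - dumped.keys,
    only_in_dump: dumped.keys - repo.keys,
    changed: {}
  }
  (repo.keys & dumped.keys).each do |table|
    missing = repo[table] - dumped[table]
    extra = dumped[table] - repo[table]
    next if missing.empty? && extra.empty?

    diff[:changed][table] = { missing_in_dump: missing, extra_in_dump: extra }
  end
  diff
end

# Usage: ruby schema_diff.rb db/schema.rb path/to/dumped_schema.rb
# (the script name and the second path are placeholders)
if __FILE__ == $PROGRAM_NAME && ARGV.size == 2
  require "pp"
  pp schema_diff(File.read(ARGV[0]), File.read(ARGV[1]))
end
```

Because definition lines are compared as plain strings, a column whose options changed shows up both as missing and as extra, which is usually what one wants to see when hunting for drift.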
I propose to proceed as follows:
- Just in case, back up the databases of mampf-dev and mampf-experimental entirely, so that we can revert to them in the future if necessary for testing.
- Start with the currently available pre-seed that we already use to develop locally. With a domain expert, improve this dataset by adding more data for specific scenarios we want to cover. Compare this dataset with the one currently at mampf-experimental and try to come close to it.
- Then, pre-seed mampf-dev and mampf-experimental with this new dataset so that our local development and the staging environments share the same data.
If this is perceived as too cumbersome, we might as well start with the dataset in mampf-dev or mampf-experimental and fix its data inconsistencies manually, e.g. by reapplying migrations or fixing outdated constraints in the rails console. Then, use this dataset for local development as well. With this approach there is of course a risk that some data inconsistencies remain unnoticed in the end, but that risk can be mitigated.
I don't know which approach would be better, as it depends on which dataset we consider "better", with "better" meaning better suited to reflect the actual production environment with all its complexity. A first step is just to browse MaMpf in the respective environments, log in as admin and click through many menus to get a feel for what kind of data is available or might be lacking.
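One way to reduce the risk of unnoticed inconsistencies is to search systematically for rows whose foreign key points at a record that no longer exists. A minimal sketch, assuming the rows are fetched as plain `[id, foreign_key]` pairs; `orphaned_ids` is a hypothetical helper, not something that exists in the mampf code base:

```ruby
require "set"

# Return the ids of child rows whose foreign key references a missing parent.
# child_rows: array of [id, foreign_key_value] pairs
# parent_ids: enumerable of existing parent primary keys
# Rows with a nil foreign key are treated as "no reference" and skipped.
def orphaned_ids(child_rows, parent_ids)
  known = parent_ids.to_set
  child_rows.reject { |_id, fk| fk.nil? || known.include?(fk) }.map(&:first)
end

# In a rails console this could be fed from ActiveRecord, for example:
#   orphaned_ids(Lecture.pluck(:id, :course_id), Course.ids)
# Lecture, Course and course_id are assumed names used only for illustration.
```

Running such a check for every association before and after a manual fix gives at least some evidence that the fix did what was intended.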
We might tackle this issue progressively through a series of PRs to the mampf-init-data repo.
We have to make sure that user data from mampf-dev and mampf-experimental is not used, since the seed data is publicly available and must not contain any sensitive information such as e-mail addresses.
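A simple guard against accidentally publishing personal data could be to replace user fields with placeholders and to scan the resulting dump for remaining e-mail addresses before it goes into the seed repo. This is only a sketch: the field names `"email"` and `"name"`, the allowed domains and both helper names are assumptions, and the regular expression is a rough heuristic rather than a full e-mail grammar.

```ruby
# Rough heuristic for things that look like e-mail addresses.
EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/

# Assumption: placeholder addresses in the seed data use reserved example domains.
ALLOWED_DOMAINS = ["example.com", "example.org"].freeze

# Replace personal fields of each user hash with numbered placeholders.
# The field names "email" and "name" are assumed for illustration.
def anonymize_users(users)
  users.each_with_index.map do |user, index|
    number = index + 1
    user.merge("email" => "user#{number}@example.com", "name" => "User #{number}")
  end
end

# Return all addresses found in the text that are not on an allowed domain.
def leaked_emails(text)
  text.scan(EMAIL_PATTERN).uniq.reject do |address|
    ALLOWED_DOMAINS.include?(address.split("@").last.downcase)
  end
end
```

A non-empty result from `leaked_emails` on the final dump would then be a reason to stop and clean up before opening the PR.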
See also the wiki entry "What to do when the mampf-experimental db schema is broken?".
The only thing left to do is to make the new dataset available to experimental and next as well, in a modified form (with non-default passwords).