v2.0.0 Results Export
Overview
About as ready as I can get it - there is a chance some of the S3 stuff fails in prod, as I can't test that locally/on staging, but otherwise I have tested this locally and I'm generally happy with the shape of it.
This PR does the following:
- adds support for different versions of the results exports, including marking versions and end-of-life dates (sketched below)
- adds v2 of the results export, including changes to schema, readme, db configs
- generates results dumps for all "live" (non-EOL) versions
- updates the /export/results page to feature v2 data, point at v2 links, and explain the deprecation
- maintains the existing v1 permalinks until 2026-01-01, so existing tools should not be affected
- endpoint changes are supported by tests
I've run it locally and I'm happy that it does what it should - with the caveat that I haven't exhaustively tested the output data beyond general sanity checks of whether the right tables are present and that the shape of their data is accurate.
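To illustrate the version/EOL bookkeeping mentioned in the list above, here is a minimal sketch in plain Ruby. It is not the actual implementation - the constant and method names are hypothetical - but it shows how versions and their end-of-life dates could be tracked so that both the export job and the /export/results page can ask which versions are still live:

```ruby
require "date"

# Hypothetical registry of export format versions and their end-of-life dates.
# A nil EOL date means the version has no scheduled retirement yet.
EXPORT_VERSIONS = {
  "v1.0.0" => { eol: Date.new(2026, 1, 1) },
  "v2.0.0" => { eol: nil },
}.freeze

# A version counts as "live" until (but not including) its EOL date.
def live_export_versions(today: Date.today)
  EXPORT_VERSIONS.select { |_name, meta| meta[:eol].nil? || today < meta[:eol] }.keys
end

live_export_versions(today: Date.new(2025, 12, 31)) # => ["v1.0.0", "v2.0.0"]
live_export_versions(today: Date.new(2026, 1, 1))   # => ["v2.0.0"]
```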
TODO
- [x] v2 export changes
  - [x] Drop value1-5 from `results`
  - [x] Include `result_attempts`
  - [x] Figure out where `PUBLIC_RESULTS_VERSION` is used, and determine
    - [ ] if it should be v1 or v2
    - [ ] if we need a second constant for the one
- [x] Export job
  - [x] Add a v2 export schema
  - [x] Generate the v2 export at all
  - [x] Should we change the name of the dump config? Or can we use results export for both?
  - [x] Only generate the v2 export after 2026-01-01
- [x] Cronjob changes
  - [x] Generate v2 as part of the result exports - figure out if we want to do both as part of one command, or do them separately. Separately is probably better from a job management perspective? At least, they should be separate jobs, but triggered by the same cron entry point, and that entry point can be updated to remove v1 from it
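As a rough illustration of the "separate jobs, one cron entry point" idea above, here is a minimal sketch in plain Ruby. The job classes and names are hypothetical stand-ins, not the real jobs in the codebase:

```ruby
require "date"

# Hypothetical stand-ins for the per-version export jobs.
class ResultsExportV1Job
  def self.perform_later
    puts "enqueue v1 results export"
  end
end

class ResultsExportV2Job
  def self.perform_later
    puts "enqueue v2 results export"
  end
end

EXPORT_JOBS = {
  "v1.0.0" => { job: ResultsExportV1Job, eol: Date.new(2026, 1, 1) },
  "v2.0.0" => { job: ResultsExportV2Job, eol: nil },
}.freeze

# Single cron entry point: enqueue one separate job per non-EOL version,
# so dropping v1 later is just its EOL date passing (or deleting its entry).
def enqueue_results_exports(today: Date.today)
  EXPORT_JOBS.each do |_version, meta|
    next if meta[:eol] && today >= meta[:eol] # skip retired formats
    meta[:job].perform_later
  end
end

enqueue_results_exports(today: Date.new(2025, 12, 1)) # enqueues both v1 and v2
```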
- [x] Page updates
  - [x] Everything should be presented with v2 being the new default
  - [x] Format change explanation - v1 support will be discontinued on 2026-01-01, devs are encouraged to update before then
  - [x] Link to mailing list
- [x] Update the README
  - [x] Versioning information
  - [x] Subscribe to mailing list
  - [x] Updated info re structure of tables
- [x] Link changes
  - [x] Have a new link to the v2 file
  - [x] Raise an error in the v1 link from and including 2026-01-01
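A minimal sketch of the v1 permalink cutoff described in the last item above - the method name and URL are hypothetical placeholders, not the real route:

```ruby
require "date"

V1_EOL_DATE = Date.new(2026, 1, 1) # v1 permalink stops working from this date

# Hypothetical guard for the existing v1 permalink: keep serving it until the
# EOL date, then fail loudly so tool authors notice the deprecation.
def v1_export_link(today: Date.today)
  if today >= V1_EOL_DATE
    raise "The v1.0.0 results export was discontinued on #{V1_EOL_DATE}; " \
          "please switch to the v2.0.0 export."
  end
  "https://example.com/export/results/WCA_export_v1.sql.zip" # placeholder URL
end
```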
Rollout process
- Get this PR done so that we have a reference for the v2.0.0 format available on the website
- Communicate the impending change to developers
- Announce on all relevant platforms
- Reach out specifically to devs who maintain popular tools to ensure they're aware of the incoming change
- v1 will remain default until and including 2025-12-31
- We will increment to v2.0.0 on 2026-01-01, and stop providing v1.0.0 at this time
- There is an argument for continuing to provide v1.0.0 until we have the first H2H results in the db (at this point we lose backwards compatibility) - but I prefer making the breaking change earlier, provided we are able to give ~1 month of notice
You need a whole new `results_dump_v2_schema` file that is loaded into the database
> You need a whole new `results_dump_v2_schema` file that is loaded into the database
Thanks for letting me know! What's the point of this schema?
It's the "starting point" for the dump routine. Our database dumps (results dump and dev dump) follow the same high-level flow, which goes like this:
- Create a new database
- Import some schema on that database
- Run the `database_dumper` routines
- Run `mysqldump` on the database you just filled
- Zip and upload
In essence, the huge SANITIZER hashes in `database_dumper` are only running the INSERT INTO statements table by table. But they need a CREATE TABLE structure to insert into, and that's what the schemas in `config/database.yml` are doing.
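To make that flow concrete, here is a rough sketch in plain Ruby. The database name, schema path, and helper are hypothetical; in the real job the table filling is done by the `database_dumper` routines and the zipped result is uploaded to S3 afterwards:

```ruby
require "open3"

DUMP_DB    = "results_export_v2_scratch"     # hypothetical scratch database name
SCHEMA_SQL = "db/results_dump_v2_schema.sql" # hypothetical schema file path

# Run a shell command and fail loudly if it exits non-zero.
def run!(*cmd)
  out, status = Open3.capture2e(*cmd)
  raise "command failed: #{cmd.join(' ')}\n#{out}" unless status.success?
end

def generate_v2_dump
  # 1. Create a fresh database to dump from.
  run!("mysql", "-e", "DROP DATABASE IF EXISTS #{DUMP_DB}; CREATE DATABASE #{DUMP_DB}")

  # 2. Import the v2 schema (the CREATE TABLE statements the INSERTs need).
  run!("bash", "-c", "mysql #{DUMP_DB} < #{SCHEMA_SQL}")

  # 3. Fill the tables; in the real app this is the database_dumper routines
  #    running the sanitized INSERT INTO statements table by table.
  # fill_tables(DUMP_DB)

  # 4. Dump the populated database to a single SQL file.
  run!("bash", "-c", "mysqldump #{DUMP_DB} > WCA_export_v2.sql")

  # 5. Zip it up (the upload step would follow in the real job).
  run!("zip", "WCA_export_v2.sql.zip", "WCA_export_v2.sql")
end
```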
@thewca-bot deploy staging