ADD export course data functionality
Even though we have hardcore recurring backups running on the cluster, we should make class data accessible to admins.
Overview of feature
Basically, only Teo and I will have access to the actual Anubis database in prod. If I were a TA for a course using Anubis, I would want to be able to generate a backup of the class data through the admin panel.
Format of exports
These should probably be zip archives as they are the most consistent across platforms (linux, osx, win...).
- We could do an SQL dump of the class data in the database, but that may be a bit tricky to generate.
- Alternatively, we could generate a json file (or several) with the class data. This would be the easiest to generate, but maybe the most difficult to understand from a course admin's perspective.
- Something we should consider, but probably not actually do, is dumping to csv or excel files. We can certainly do that, but it will likely be an even more difficult format to parse than json.
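To make the json option concrete, here is a minimal sketch of a zip archive with one json file per table. The function name `build_export_archive` and the table and column names in the example are made up for illustration; they are not the actual Anubis schema.

```python
import io
import json
import zipfile


def build_export_archive(tables: dict) -> bytes:
    """Write one json file per table into an in-memory zip archive.

    `tables` maps a (hypothetical) table name to a list of row dicts.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for table_name, rows in tables.items():
            # default=str covers values such as datetimes that json cannot encode natively
            payload = json.dumps(rows, indent=2, default=str)
            archive.writestr(f"{table_name}.json", payload)
    return buffer.getvalue()


# Example with made-up rows
example = build_export_archive({
    "assignments": [{"id": "a1", "name": "Lab 1"}],
    "submissions": [{"id": "s1", "assignment_id": "a1", "state": "processed"}],
})
```

One file per table keeps each json file small enough to open on its own, which may help with the readability concern for course admins.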
Optimizations I see
Most of the class data is very reasonable in size. Where things get tricky is the stdout of builds and tests. We could make including them an option that is off by default. A tooltip next to that switch should probably warn that enabling it will blow up the size of the export.
Another table where it is unclear whether we should include it is the static files table. Maybe we can put the metadata in the json dump, then save the files themselves in the archive. This would greatly reduce the size of the data file for the static file table.
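Both optimizations could look roughly like the sketch below. The column names (`stdout`, `blob`, `filename`) and the `static/<id>/<filename>` layout inside the archive are assumptions for illustration, not the real schema.

```python
# Hypothetical names of the large output columns on build/test rows
OUTPUT_COLUMNS = {"stdout"}


def strip_output(rows, include_output=False):
    """Drop the large output columns unless the admin opted in."""
    if include_output:
        return rows
    return [
        {key: value for key, value in row.items() if key not in OUTPUT_COLUMNS}
        for row in rows
    ]


def split_static_files(rows):
    """Separate static file metadata (for the json dump) from the blobs
    (to be written as regular members of the archive)."""
    metadata = []
    blobs = {}
    for row in rows:
        archive_path = f"static/{row['id']}/{row['filename']}"
        metadata.append({
            "id": row["id"],
            "filename": row["filename"],
            "archive_path": archive_path,
        })
        blobs[archive_path] = row["blob"]
    return metadata, blobs


# Example with made-up rows
tests = [{"id": "t1", "passed": True, "stdout": "x" * 10_000}]
small = strip_output(tests)
meta, blobs = split_static_files(
    [{"id": "f1", "filename": "lecture1.pdf", "blob": b"%PDF-"}]
)
```

Storing the `archive_path` in the metadata lets someone reading the json dump find the matching file in the archive.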
Flow
Generating class data exports is going to take a long time no matter how well we optimize. The way we should handle this is by doing the exports asynchronously in an rpc job, and storing the results to be retrieved later.
I see there being a separate tab in the admin panel for exports. We can then have a table of generated class data export archives. We would have a button on the page, similar to the one on the static file page. Next to it, we can have the option for including the extra columns that will blow up the size of the archive.
When the export button is clicked, we launch that async rpc job to generate the backup. The rpc worker will pick up the job, and use a temporary directory to build up the backup, zip it up, ship it off, then clean up the temporary directory.
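The worker side of that flow might be sketched as follows. This only covers the temporary directory handling; `run_export` and the `ship` callback are hypothetical stand-ins for however the rpc job gets its data and stores the finished archive.

```python
import json
import shutil
import tempfile
from pathlib import Path
from typing import Callable


def run_export(tables: dict, ship: Callable[[bytes], None]) -> None:
    """Build the export in a temporary directory, zip it, hand it off, clean up."""
    with tempfile.TemporaryDirectory() as tmp:
        staging = Path(tmp) / "export"
        staging.mkdir()

        # Build up the backup: one json file per table
        for table_name, rows in tables.items():
            path = staging / f"{table_name}.json"
            path.write_text(json.dumps(rows, default=str))

        # Zip it up; make_archive appends ".zip" and returns the final path.
        # The archive is written next to the staging directory, not inside it.
        archive_path = shutil.make_archive(
            str(Path(tmp) / "archive"), "zip", root_dir=str(staging)
        )

        # Ship it off (hypothetical callback, e.g. save to the database)
        ship(Path(archive_path).read_bytes())
    # Leaving the with-block removes the temporary directory and everything in it


# Example: collect the shipped archive in a list instead of storing it
shipped = []
run_export({"assignments": [{"id": "a1", "name": "Lab 1"}]}, shipped.append)
```

Using `tempfile.TemporaryDirectory` as a context manager means the cleanup also happens if building or shipping the archive raises.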
While all of this is happening, the user will see an entry added to the table for the export, with a pending status. We can poll the api for updates on the export for something like 60-120 seconds. When the export is finished, we give them a link where they can download it.
Export storage
We cannot put these export archive blobs on the StaticFile table as it is. Right now static files are always public; it takes no authentication to get them. This is fine for what it is meant for (hosting things like images and lecture pdfs), but it is far too open for sensitive data. We will either need to add a table just for exports, with endpoints to match (that do the necessary authentication checks), or we can add something like an is_export field to static file. The existing static file get endpoint could then require an authenticated admin whenever that field is set to true.
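As a rough sketch of the second option, the check in the get endpoint could boil down to something like this. The `StaticFile` and `User` classes here are simplified stand-ins, and the `course_id` and `admin_for` fields are assumptions, not the real models.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class StaticFile:
    # Simplified stand-in for the real model
    course_id: str
    is_export: bool = False


@dataclass
class User:
    # Hypothetical: ids of the courses this user is an admin for
    admin_for: set = field(default_factory=set)


def can_download(file: StaticFile, user: Optional[User]) -> bool:
    """Decide whether the static file get endpoint should serve this file."""
    if not file.is_export:
        # Existing behavior: regular static files stay public
        return True
    # Exports require a logged in admin for the course the export belongs to
    return user is not None and file.course_id in user.admin_for


# Example
export = StaticFile(course_id="c1", is_export=True)
image = StaticFile(course_id="c1")
ta = User(admin_for={"c1"})
```

A separate exports table would avoid adding a branch to a public endpoint, at the cost of more code; the check itself would look much the same.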