[SIP-192] Labeled Version History + Restore
[SIP] Proposal for Labeled Version History + Restore
Motivation
Some users want a way to save the state of their virtual datasets and other assets, and to roll back to a saved state later.
This was also mentioned in https://github.com/apache/superset/issues/30436, where some people talked about opening a SIP. I looked for one and couldn't find it, so I'll assume it was never opened and that people still want something like this.
Proposed Change
In each of the Dashboard/Chart/Dataset edit properties modals, I plan to add a version history button at the bottom:
Once clicked, it shows all the versions you have saved and gives you the option to save a new one:
If you choose to save a new one, you get this popup with the option to save a named version:
Clicking on a version gives you the option to restore it:
Here are some end-to-end examples I've recorded:
Datasets:
Charts:
Dashboards:
Work not shown here that I'd probably still need to do: deleting versions, and capping the number of versions per asset (dashboard, chart, dataset), perhaps configurable in the config.
For now, this uses the existing import/export functionality with a few modifications.
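As a rough illustration of the cap and delete ideas, here is a minimal sketch using an in-memory store. The setting name `VERSION_HISTORY_MAX_PER_ASSET`, its default, and the `VersionStore` class are all hypothetical; none of them exist in Superset today. It assumes the oldest versions are the ones dropped once the cap is exceeded, which is only one possible policy.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Hypothetical config name and default; not an existing Superset setting.
VERSION_HISTORY_MAX_PER_ASSET = 3


@dataclass
class VersionStore:
    """In-memory stand-in for wherever versions end up being stored."""

    versions: dict[tuple[str, int], list[str]] = field(default_factory=dict)

    def save(self, asset_type: str, asset_id: int, label: str) -> list[str]:
        """Append a labeled version, then drop the oldest ones over the cap."""
        history = self.versions.setdefault((asset_type, asset_id), [])
        history.append(label)
        overflow = len(history) - VERSION_HISTORY_MAX_PER_ASSET
        if overflow > 0:
            del history[:overflow]  # assumption: oldest versions go first
        return list(history)

    def delete(self, asset_type: str, asset_id: int, label: str) -> bool:
        """Remove one labeled version; return False if it was not found."""
        history = self.versions.get((asset_type, asset_id), [])
        if label in history:
            history.remove(label)
            return True
        return False
```

Whether to silently drop the oldest version or refuse the save once the cap is reached is a design choice that would still need discussion.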
Some extra notes about breaking changes between versions. I have come up with cases where restoring makes things look broken, and this is intended. My reasoning is: if the asset had been left at the restored version all along, would it have broken anyway?
For example, say we have chart version 0. The chart has since changed a lot, but we need to bring back version 0. In its current state the chart uses Dataset 1* (Dataset 1 after breaking changes were made to it), but version 0 was built against the original Dataset 1. If we restore the chart to version 0, it is no longer compatible with the dataset and breaks. However, had the chart been left at version 0 in the first place, it would have broken anyway. The same thinking applies here:
Reverting a dataset can change charts. There is already a warning when you change a dataset manually, and the same applies here: if you revert a dataset to a breaking version, the charts that rely on its current state will break too.
Chart reverts will ALSO appear on dashboards. This makes sense, but users should keep it in mind when reverting: they are changing the same chart that the dashboards use.
Dashboards are where it gets a little more intricate, and where users will perhaps see the most unexpected behaviour while still following the design of "reverts will be as if no edits were made": the charts shown will be the current ones, not the charts as they were when the version was saved. I can think of 2 main odd scenarios.
- Chart is deleted. It will show up on the dashboard with the deleted chart error. Any dashboard whose chart is deleted shows the same thing anyway: if the dashboard had never been modified and the chart was deleted, it would show the same error.
- Changed charts. If a chart was changed between the time the version was made and the time it was restored, the dashboard shows the new, changed chart. Again, if the dashboard had never been modified, the chart would also appear changed.
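The two dashboard scenarios can be sketched with plain dictionaries. This assumes a saved dashboard version holds references to charts (ids) rather than copies of them; the data shapes and the `render_restored_dashboard` helper are made up for illustration and are not Superset code.

```python
# Charts as they exist right now, keyed by chart id.
# Chart 1 was edited after the dashboard version was saved; chart 2 was deleted.
current_charts = {1: "Sales (edited after the snapshot)", 3: "Traffic"}

# Hypothetical saved dashboard version: it stores chart ids, not chart contents.
dashboard_version_0 = {"title": "KPIs", "chart_ids": [1, 2]}


def render_restored_dashboard(version: dict, charts: dict) -> list:
    """Resolve each referenced chart id against the charts that exist today."""
    rendered = []
    for chart_id in version["chart_ids"]:
        if chart_id in charts:
            # Changed chart: the current state is shown, not the old one.
            rendered.append(charts[chart_id])
        else:
            # Deleted chart: same error an unmodified dashboard would show.
            rendered.append(f"<chart {chart_id} not found>")
    return rendered
```

Restoring `dashboard_version_0` here yields the edited chart 1 and an error placeholder for chart 2, which is exactly what a dashboard that had never been edited would show.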
New or Changed Public Interfaces
We will probably need at least a new set of endpoints for listing, saving, and restoring versions:
@expose("/<asset_type>/<int:asset_id>/save", methods=["POST"])
def save_version(self, asset_type: str, asset_id: int) -> Response:
    """Save the asset's current state as a new version."""
@expose("/<asset_type>/<int:asset_id>/list", methods=["GET"])
def list_versions(self, asset_type: str, asset_id: int) -> Response:
    """List the saved versions of the asset."""
@expose("/<asset_type>/<int:asset_id>/restore", methods=["POST"])
def restore_version(self, asset_type: str, asset_id: int) -> Response:
    """Restore the asset to a previously saved version."""
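To make the intended behaviour of the three operations concrete, here is a minimal in-memory sketch. It is not the proposed implementation: the `VersionService` and `Version` classes are hypothetical, assets are plain dicts, and it assumes the version to restore is identified by a number supplied by the caller (for the real endpoint, presumably in the request body).

```python
import copy
from dataclasses import dataclass


@dataclass
class Version:
    number: int
    description: str
    snapshot: dict  # full copy of the asset at save time


class VersionService:
    """Hypothetical service mirroring the save/list/restore endpoints."""

    def __init__(self, assets: dict):
        self.assets = assets  # {(asset_type, asset_id): asset dict}
        self.history = {}     # {(asset_type, asset_id): [Version, ...]}

    def save_version(self, asset_type: str, asset_id: int, description: str) -> int:
        key = (asset_type, asset_id)
        versions = self.history.setdefault(key, [])
        # Deep copy so later edits to the asset do not alter the snapshot.
        version = Version(len(versions), description, copy.deepcopy(self.assets[key]))
        versions.append(version)
        return version.number

    def list_versions(self, asset_type: str, asset_id: int) -> list:
        versions = self.history.get((asset_type, asset_id), [])
        return [(v.number, v.description) for v in versions]

    def restore_version(self, asset_type: str, asset_id: int, number: int) -> None:
        key = (asset_type, asset_id)
        for version in self.history.get(key, []):
            if version.number == number:
                self.assets[key] = copy.deepcopy(version.snapshot)
                return
        raise KeyError(f"no version {number} for {asset_type} {asset_id}")
```

A typical flow would be: save a version, keep editing the asset, then call `restore_version` with the number returned by `save_version` to get the earlier state back.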
Depending on the implementation, we'd probably need a version model to access this data cleanly. I know some people want this stored in git(??); more about this in Migration Plan and Compatibility.
For deployment, if we end up having a configurable max cap per asset, a config variable with a default will probably be needed. IF GIT IS USED, a token and a repo should be provided.
New dependencies
Git packages? It depends on what people want and what makes the most sense.
Migration Plan and Compatibility
I think ideally a new table should be created for the versions: | Asset_Type | Asset_ID | Version_number | Description_of_version |
The DB upgrade will create the table, and since it's used solely for the history, downgrading will just delete the table. Since nothing else will use the table, I don't think deleting it on downgrade will affect anything.
HOWEVER, git storage would give users a way to edit and create versions outside of Superset, and to have a full approval process where a supervisor or someone similar reviews changes to datasets. I don't know if this is the best thing to do, but it's on the table when discussing with other people.
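The upgrade/downgrade pair could look roughly like the sketch below. It uses Python's built-in `sqlite3` as a stand-in rather than a real Alembic migration, the table name `asset_versions` is hypothetical, and the composite primary key is an assumption. Only the four columns listed above are included; where the saved snapshot itself would live is left open.

```python
import sqlite3


def upgrade(conn: sqlite3.Connection) -> None:
    """Create the version history table (table name is hypothetical)."""
    conn.execute(
        """
        CREATE TABLE asset_versions (
            asset_type TEXT NOT NULL,
            asset_id INTEGER NOT NULL,
            version_number INTEGER NOT NULL,
            description_of_version TEXT,
            -- Assumption: one row per (asset, version number).
            PRIMARY KEY (asset_type, asset_id, version_number)
        )
        """
    )


def downgrade(conn: sqlite3.Connection) -> None:
    """Drop the table; nothing else references it, so this loses only history."""
    conn.execute("DROP TABLE asset_versions")
```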
The recordings in MP4 format:
https://github.com/user-attachments/assets/15891a84-0c86-4c9f-8721-a119a5b01419
https://github.com/user-attachments/assets/8aee43a2-fea9-4796-9312-105989ba9b92
https://github.com/user-attachments/assets/a4b1c0b1-d5e7-4193-b5a7-6bdc4f01566f
Wow, this would be a godsend. We have been looking for this for a while.
Please follow the SIP process outlined in SIP-0
@sadpandajoe is it just the mailing list?
If you haven't already, you can subscribe to the Apache Superset Dev Mailing Listserv ([email protected], with public visibility here) by sending an email to [email protected].
@ethan-l-geotab feel free to move to [DISCUSS] status after you have emailed the thread.
I emailed the thread, but I don't think I can move it to discuss. Is there an appetite for this? I thought there might be, seeing how people mentioned it with the archive SIP a while back.