oeplatform
versioning of data
As a user, I want to access old versions of data if I accidentally entered something wrong.
This cannot be fully achieved within the current project. Generally speaking, this is possible, but only by hand. A user interface would be too complicated. In emergency situations @MGlauer can recover old versions, but this can only be a last resort and is not meant to be a regular service so far. For the time being there should be a notice somewhere communicating as much.
RequirementSpecificationID=62
Might this be a potential use case? I upload a table that successfully passes the review. It now lives not in model_draft but in its "final" schema. However, this table contains data that may need amendments periodically or occasionally (i.e. lines to be added). How can this be made possible?
@Ludee and I are currently finalizing a first data set of the kind you described @han-f here. Our idea was to handle curation in a GitHub issue. We will still need to streamline the process of implementing contributions, but the general idea seems viable.
There will be a notice describing the current state. There will be no specific interface to access different versions (yet).
Firstly, we could simply implement the advice/notice mentioned in the comments above.
This is relevant again, and we have talked about versioning on several occasions.
Currently our revision system is in operation. It keeps track of all user transactions (all create, update and delete changes) and stores them in a schema called, for example, "_model_draft" if the original table is in "model_draft". If a user then creates, updates or deletes data in the table, the changes are saved as a delta. To go one step further and calculate versions from these deltas, we would need a system that can track which changes belong to which version.
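To make the delta-based approach more concrete, here is a minimal sketch of how a version could be rebuilt by replaying logged changes. It is plain in-memory Python and not the actual revision system: the `Delta` class, its field names and the `version` marker are all hypothetical, and the log is assumed to be ordered by version.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Delta:
    """One logged change. All field names are hypothetical."""
    op: str                                   # "insert", "update" or "delete"
    row_id: int
    values: Optional[dict[str, Any]] = None   # changed columns; None for deletes
    version: int = 0                          # assumed marker tying a change to a version


def replay(deltas: list[Delta], up_to_version: int) -> dict[int, dict[str, Any]]:
    """Rebuild the table state at a given version by replaying deltas in order."""
    rows: dict[int, dict[str, Any]] = {}
    for d in deltas:
        if d.version > up_to_version:
            break  # assumes the log is ordered by version
        if d.op == "insert":
            rows[d.row_id] = dict(d.values or {})
        elif d.op == "update":
            rows[d.row_id].update(d.values or {})
        elif d.op == "delete":
            rows.pop(d.row_id, None)
        else:
            raise ValueError(f"unknown operation: {d.op}")
    return rows


# Example log: two rows inserted in version 1, then one updated and one deleted in version 2.
log = [
    Delta("insert", 1, {"name": "wind", "capacity": 10}, version=1),
    Delta("insert", 2, {"name": "solar", "capacity": 5}, version=1),
    Delta("update", 1, {"capacity": 12}, version=2),
    Delta("delete", 2, version=2),
]

state_v1 = replay(log, up_to_version=1)
state_v2 = replay(log, up_to_version=2)
```

The extra piece compared to what we have today is the `version` field: without something like it, the deltas cannot be grouped into versions.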
This type of solution is relatively complex, especially since our current data publishing workflow would allow for a simpler solution:
The user has to upload the data while it is in the theme / schema model_draft. For this reason, I think we could create a new version every time the user publishes the data. That is, we would need to provide an "upload new version" button or something similar next to a "published" table. Then we only need to track the new upload and save the previous version, and we don't need to calculate the deltas for each version.
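As a rough illustration of this simpler idea, the sketch below archives the previous data as a full copy whenever a new upload is published. It is an in-memory stand-in only: the `PublishedTable` class and its methods are hypothetical and do not correspond to existing platform code.

```python
import copy
from typing import Any

Row = dict[str, Any]


class PublishedTable:
    """In-memory stand-in for a published table with full-copy versions (hypothetical)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.current: list[Row] = []
        self.archive: dict[int, list[Row]] = {}  # version number -> archived snapshot
        self.version = 0                         # 0 means nothing has been published yet

    def publish(self, draft_rows: list[Row]) -> int:
        """Archive the current data, then replace it with the new upload."""
        if self.version > 0:
            self.archive[self.version] = self.current
        # Copy the draft so later edits to it do not leak into the published data.
        self.current = copy.deepcopy(draft_rows)
        self.version += 1
        return self.version

    def get(self, version: int) -> list[Row]:
        """Return the rows of a given version; raises KeyError for unknown versions."""
        if version == self.version:
            return self.current
        return self.archive[version]


table = PublishedTable("demo_table")
v1 = table.publish([{"id": 1, "value": "a"}])
v2 = table.publish([{"id": 1, "value": "a"}, {"id": 2, "value": "b"}])
```

The trade-off is storage: every publish keeps a complete copy of the previous data, but no delta calculation is needed to read an old version.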
This solution seems simpler to me, but may not be mature enough yet. There may also be other advantages if we choose the more complex solution.