notes for schema dissolve
README
current state
- only allowed to create / insert into sandbox and model_draft
- in code: usually checked with UNVERSIONED_SCHEMAS | PLAYGROUNDS
- finalized data: move from model_draft to TARGET (only via API)
- Table model references Schema model instance
- in security settings
- DEFAULT_SCHEMA = 'sandbox'
- PLAYGROUNDS = (DEFAULT_SCHEMA, 'test')
- UNVERSIONED_SCHEMAS = ("model_draft", )
- in dataedit/views.py:
- schema_whitelist, schema_sandbox
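A minimal sketch of what the write check described above could look like, assuming the settings are plain tuples as listed. The helper name `is_writable_schema` is hypothetical and not taken from the codebase. Tuples do not support `|`, so the sketch converts them to sets first; whether the real code does the same is not stated in these notes.

```python
# Values as listed in the notes above (security settings).
DEFAULT_SCHEMA = "sandbox"
PLAYGROUNDS = (DEFAULT_SCHEMA, "test")
UNVERSIONED_SCHEMAS = ("model_draft",)

# Tuples have no "|" operator, so build the union as a set once.
WRITABLE_SCHEMAS = set(UNVERSIONED_SCHEMAS) | set(PLAYGROUNDS)


def is_writable_schema(schema: str) -> bool:
    """Hypothetical helper: may tables be created / inserted into in this schema?"""
    return schema in WRITABLE_SCHEMAS


print(is_writable_schema("model_draft"))  # True
print(is_writable_schema("dataset"))      # False
```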
desired state
- ~newly created tables all in database schema dataset~
- ~is_draft is an attribute~
- newly created tables in model_draft
- tables can be identified by their name alone
- there is an API + UI function to mark a table as no longer draft
- this moves the table into (versioned?) schema dataset
- should there also be a reverse?
- you can delete all
- you can only edit in draft
- we have now "topics" that replace the schemas
- table can be in multiple topics
- tables not in draft are always in dataset
- topics in platform:
- if model_draft: do not show in topics
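The desired state above can be sketched as a small in-memory model. This is only an illustration under stated assumptions: the `Table` class here is a hypothetical stand-in, not the Django model, and draft status is derived from the schema rather than stored as an attribute.

```python
from dataclasses import dataclass, field

DRAFT_SCHEMA = "model_draft"
DATASET_SCHEMA = "dataset"


@dataclass
class Table:
    """Hypothetical stand-in for the Table model; identified by name alone."""

    name: str
    schema: str = DRAFT_SCHEMA                # newly created tables start in draft
    topics: set = field(default_factory=set)  # a table can be in multiple topics

    @property
    def is_draft(self) -> bool:
        # Derived from the schema, not a stored attribute.
        return self.schema == DRAFT_SCHEMA

    def can_edit(self) -> bool:
        # "you can only edit in draft"
        return self.is_draft

    def publish(self, topics) -> None:
        # Mark as no longer draft: move the table into the dataset schema.
        if not self.is_draft:
            raise ValueError(f"{self.name} is already published")
        self.schema = DATASET_SCHEMA
        self.topics |= set(topics)

    def visible_in_topics(self) -> bool:
        # Draft tables are not shown in topics.
        return not self.is_draft
```

A reverse operation (back to draft) is left out on purpose, since the notes still list it as an open question.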
problems
- lots of potentially dangerous migrations, and we don't have (good) access to the backend
- what are the security settings on the server (DEFAULT_SCHEMA, ...)
- how exactly does the versioning work (and also, what about the tables like `_edit_base` and so on)?
IMPORTANT NOTES for productive release
- change in securitysettings:
# schema for user testing
SANDBOX_SCHEMA = "sandbox"
# schema for newly created, unfinished data
DRAFT_SCHEMA = "model_draft"
# schema for finalized datasets
DATASET_SCHEMA = "dataset"
- make sure security settings are updated (schema names, see default file)
- run `python manage.py migrate` BEFORE `python manage.py alembic upgrade head`, because topics have to be created first
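The "make sure security settings are updated" step could be checked before release with something like the following. This is a rough sketch: the function name is hypothetical, and `SimpleNamespace` only stands in for the real settings module.

```python
from types import SimpleNamespace

# Setting names as given in the notes above.
REQUIRED_SCHEMA_SETTINGS = ("SANDBOX_SCHEMA", "DRAFT_SCHEMA", "DATASET_SCHEMA")


def missing_schema_settings(settings) -> list:
    """Hypothetical pre-release check: which schema settings are missing or empty?"""
    return [
        name
        for name in REQUIRED_SCHEMA_SETTINGS
        if not getattr(settings, name, None)
    ]


# Stand-in for an outdated settings module that still uses the old names.
old = SimpleNamespace(DEFAULT_SCHEMA="sandbox", UNVERSIONED_SCHEMAS=("model_draft",))
print(missing_schema_settings(old))
# ['SANDBOX_SCHEMA', 'DRAFT_SCHEMA', 'DATASET_SCHEMA']
```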
DEBUGGING
- to switch back and forth between migrated / not migrated for testing:
downgrade
python manage.py alembic downgrade 3c2369dfcc55
python manage.py migrate dataedit 0029
upgrade
python manage.py migrate
python manage.py alembic upgrade head
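For repeated testing, the two command sequences above could be wrapped in a small helper so the order cannot be mixed up. This is a sketch only: the `switch` function and the `SEQUENCES` mapping are hypothetical, while the commands themselves are copied from the notes.

```python
import subprocess

# Command sequences copied from the notes above; the order matters.
SEQUENCES = {
    "upgrade": [
        ["python", "manage.py", "migrate"],
        ["python", "manage.py", "alembic", "upgrade", "head"],
    ],
    "downgrade": [
        ["python", "manage.py", "alembic", "downgrade", "3c2369dfcc55"],
        ["python", "manage.py", "migrate", "dataedit", "0029"],
    ],
}


def switch(direction, run=subprocess.run):
    """Hypothetical helper: run one direction, stopping at the first failing command."""
    for cmd in SEQUENCES[direction]:
        run(cmd, check=True)  # check=True raises CalledProcessError on failure
```

Calling `switch("downgrade")` followed by `switch("upgrade")` from the project root would replay the steps above; the `run` parameter exists only so the helper can be exercised without touching a database.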
Because this was a question - this is what is currently in the settings on the production OEP:
DEFAULT_SCHEMA = "sandbox"
PLAYGROUNDS = ('sandbox', 'model_draft')
UNVERSIONED_SCHEMAS = ('model_draft', )
@wingechr I'm trying to figure out the best way to display the schemas/topics. Maybe we shouldn't show the incomplete data to everyone, but only to the creator and members of an assigned authorisation group. I think what I describe below is also what you have in mind?
When a user uploads data, it is in the physical model_draft schema in the database.
- Question: Do we really want to display this data all the time? Perhaps it should only be publicly visible when metadata is available and open peer review has been requested? Otherwise we could show data without licence information. We could potentially be sued for publishing this data. In this way, the user could also decide when the data is ready for review.
All published data is stored in the dataset schema in the database and assigned to a topic.
- All tables in datasets must have a topic.
- We do not want to show a topic called "datasets" but the topic names and the tables related to that topic
- The user can use a profile page (perhaps a more visible page) to manage all the user's data. The user can also publish data and select a topic. This is only possible if the open peer review process has been successfully completed.
Btw. the publish functionality is already implemented. Once a table has been reviewed, the user sees a publish button when they visit the profile page.
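One possible reading of the visibility and topic rules proposed in this comment, as a rough sketch. The dictionary keys (`creator`, `groups`, `topics`) and both function names are assumptions for illustration and do not reflect the actual models.

```python
from collections import defaultdict

DRAFT_SCHEMA = "model_draft"


def can_view(table: dict, user: str, user_groups: set) -> bool:
    """Proposed rule: drafts are visible only to the creator and assigned group members."""
    if table["schema"] != DRAFT_SCHEMA:
        return True  # published tables are public
    return user == table["creator"] or bool(user_groups & table["groups"])


def tables_by_topic(tables: list) -> dict:
    """Group published tables under their topic names (no 'dataset' pseudo-topic)."""
    grouped = defaultdict(list)
    for table in tables:
        if table["schema"] == DRAFT_SCHEMA:
            continue  # drafts are not listed under any topic
        for topic in table["topics"]:
            grouped[topic].append(table["name"])
    return dict(grouped)
```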