notes for schema dissolve

Open wingechr opened this issue 2 years ago • 5 comments

README

current state

  • only allowed to create / insert into sandbox and model_draft
  • in code: usually checked with UNVERSIONED_SCHEMAS | PLAYGROUNDS
  • finalized data: move from model_draft to TARGET (only via API)
  • Table model references Schema model instance
  • in security settings
    • DEFAULT_SCHEMA = 'sandbox'
    • PLAYGROUNDS = (DEFAULT_SCHEMA, 'test')
    • UNVERSIONED_SCHEMAS = ("model_draft", )
  • in dataedit/views.py:
    • schema_whitelist, schema_sandbox
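
A minimal sketch of the kind of check described above. It assumes the two settings are combined as sets (plain tuples do not support `|`); the helper name `is_writable_schema` is hypothetical and not taken from the oeplatform code:

```python
# Values as listed in the security settings above.
DEFAULT_SCHEMA = "sandbox"
PLAYGROUNDS = (DEFAULT_SCHEMA, "test")
UNVERSIONED_SCHEMAS = ("model_draft",)

# Assumption: both settings are merged into one set of editable schemas.
EDITABLE_SCHEMAS = set(UNVERSIONED_SCHEMAS) | set(PLAYGROUNDS)


def is_writable_schema(schema: str) -> bool:
    """Hypothetical helper: True if create / insert is allowed in `schema`."""
    return schema in EDITABLE_SCHEMAS
```

With these values, any schema outside `sandbox`, `test` and `model_draft` would be rejected for create / insert.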

desired state

  • ~newly created tables all in database schema dataset~
  • ~is_draft is an attribute~
  • newly created tables in model_draft
  • tables can be identified by their name alone
  • there is an API + UI function to mark a table as no longer draft
    • this moves the table into (versioned?) schema dataset
  • should there also be a reverse?
  • you can delete all
  • you can only edit in draft
  • we now have "topics" that replace the schemas
  • table can be in multiple topics
  • tables not in draft are always in dataset topic
  • in platform:
    • if model_draft: do not show in topics
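
The desired state could be modelled roughly like this. This is only a sketch under assumptions: the class, its methods, and the idea of storing topics as a set of names are hypothetical, and the open questions above (reverse operation, versioning) are left out:

```python
from dataclasses import dataclass, field

DRAFT_SCHEMA = "model_draft"
DATASET_SCHEMA = "dataset"


@dataclass
class Table:
    name: str  # identifies the table on its own, without a schema
    schema: str = DRAFT_SCHEMA  # newly created tables start in model_draft
    topics: set[str] = field(default_factory=set)  # a table can be in multiple topics

    @property
    def is_draft(self) -> bool:
        return self.schema == DRAFT_SCHEMA

    def can_edit(self) -> bool:
        # editing is only allowed while the table is a draft
        return self.is_draft

    def publish(self) -> None:
        # mark as no longer draft: move to the dataset schema,
        # and always put the table into the dataset topic
        self.schema = DATASET_SCHEMA
        self.topics.add(DATASET_SCHEMA)

    def visible_topics(self) -> set[str]:
        # drafts are not shown under any topic in the platform
        return set() if self.is_draft else set(self.topics)
```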

problems

  • lots of potentially dangerous migrations, and we don't have (good) access to the backend
  • what are the security settings on the server (DEFAULT_SCHEMA, ...)?
  • how exactly does the versioning work (and also, what about the tables like _edit_base and so on)?

wingechr avatar Aug 16 '23 10:08 wingechr

IMPORTANT NOTES for productive release

  • change in securitysettings:
# schema for user testing
SANDBOX_SCHEMA = "sandbox"
# schema for newly created, unfinished data
DRAFT_SCHEMA = "model_draft"
# schema for finalized datasets
DATASET_SCHEMA = "dataset"
  • make sure security settings are updated (schema names, see default file)
  • run python manage.py migrate BEFORE python manage.py alembic upgrade head, because topics have to be created first
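
To avoid running the two steps in the wrong order, they could be wrapped in a small script. This is a sketch, not part of the repository: the function name is hypothetical, and it assumes `manage.py` is in the current working directory and that the current interpreter should be used:

```python
import subprocess
import sys

# Order matters: the Django migrations create the topics first,
# then the alembic upgrade can run.
COMMANDS = [
    [sys.executable, "manage.py", "migrate"],
    [sys.executable, "manage.py", "alembic", "upgrade", "head"],
]


def run_release_migrations(run=subprocess.run):
    """Run both steps in order; check=True aborts if the first one fails."""
    for cmd in COMMANDS:
        run(cmd, check=True)


if __name__ == "__main__":
    run_release_migrations()
```

The `run` parameter only exists so the order can be checked without touching a database.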

wingechr avatar Aug 16 '23 12:08 wingechr

DEBUGGING

  • to switch back and forth between migrated / not migrated for testing:

downgrade

python manage.py alembic downgrade 3c2369dfcc55
python manage.py migrate dataedit 0029

upgrade

python manage.py migrate
python manage.py alembic upgrade head

wingechr avatar Aug 16 '23 13:08 wingechr

Because this was a question - this is what is currently in the settings on the production OEP:

DEFAULT_SCHEMA = "sandbox"
PLAYGROUNDS = ('sandbox', 'model_draft')
UNVERSIONED_SCHEMAS = ('model_draft', )

jh-RLI avatar Aug 22 '23 07:08 jh-RLI

@wingechr I'm trying to figure out the best way to display the schemas/topics. Maybe we shouldn't show the incomplete data to everyone, but only to the creator and members of an assigned authorisation group. I think what I describe below is also what you have in mind?

When a user uploads data, it is in the physical model_draft schema in the database.

  • Question: Do we really want to display this data all the time? Perhaps it should only be publicly visible when metadata is available and open peer review has been requested? Otherwise we could show data without licence information. We could potentially be sued for publishing this data. In this way, the user could also decide when the data is ready for review.

All published data is stored in the dataset schema in the database and assigned to a topic.

  • All tables in datasets must have a topic.
  • We do not want to show a topic called "datasets", but rather the topic names and the tables related to each topic.
  • The user can use a profile page (perhaps a more visible page) to manage all the user's data. The user can also publish data and select a topic. This is only possible if the open peer review process has been successfully completed.
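
The rules proposed above could be sketched as two checks. All names here are hypothetical, and the visibility rule is still an open question rather than decided behaviour:

```python
from typing import Optional


def is_publicly_visible(
    is_published: bool, has_metadata: bool, review_requested: bool
) -> bool:
    # Published data is always visible. Draft data only becomes visible
    # once metadata exists and an open peer review has been requested.
    return is_published or (has_metadata and review_requested)


def can_publish(review_passed: bool, topic: Optional[str]) -> bool:
    # Publishing requires a completed review and a selected topic,
    # because all tables in the dataset schema must have a topic.
    return review_passed and bool(topic)
```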

jh-RLI avatar Jan 12 '24 12:01 jh-RLI

Btw. the publish functionality is already implemented. Once a table has been reviewed, the user sees a publish button when visiting the profile page.

jh-RLI avatar Jan 12 '24 12:01 jh-RLI