nmdc-schema
Implement migrators for all Berkeley schema changes (meta-issue)
Below is a list (in the order they should be executed) of the migration PRs to account for the schema changes in Berkeley:
-
  - Should occur BEFORE `OmicsProcessing` becomes `DataGeneration`
-
  - Should occur BEFORE `OmicsProcessing` becomes `DataGeneration`.
- Migrator_from_X_to_PR129: change `metabolite_quantified` and `has_metabolite_quantification` slot names
  - Should occur BEFORE `MetabolomicsAnalysisActivity` becomes `MetabolomicsAnalysis`
- Migrator_from_X_to_PR31: Remove the 'used' slot from `WorkflowExecution` subclasses
  - Needs to occur BEFORE collection name changes as it refers to old collection names
  - Needs to occur BEFORE the instrument_set migrator because this migrator compares the values in the 'used' slot to the values in the 'instrument_name' slot (which is removed in that migrator)
  - Needs to occur BEFORE the workflow chain migrator because that migrator removes the `was_informed_by` slot from the `WorkflowExecution`s, which this migrator references.
-
  - Should occur BEFORE collection name changes as this migrator refers to `omics_processing_set` and `WorkflowExecution` subclasses with "Activity" in the name.
  - Should occur AFTER `omics_type` becomes `analyte_category`
- Migrator_from_X_to_PR19_and_PR70: `instrument_set` update to `instrument_used` from `instrument_name`
  - Should occur BEFORE collection name changes as this migrator refers to `omics_processing_set`
- Migrator_from_X_to_PR2_and_PR24: change MongoDB collection names
  - This includes changing `omics_processing_set` to `data_generation_set` and the workflow execution set names
- Migrator_from_X_to_PR10: Add the `type` slot to every class instance (calls the new collection names).
  - This will need to happen AFTER the classes are renamed. E.g. this migration calls the `data_generation_set` instead of the `omics_processing_set`
  - This PR was also updated to account for new inlined classes: https://github.com/microbiomedata/berkeley-schema-fy24/pull/103
- Migrator_from_X_to_PR3: Fix type slot to specify subclass for DataGeneration subclasses
  - This needs to happen AFTER the following migrations:
    - X_to_PR4 (changing `omics_type` to `analyte_category` and updating values to enum)
    - X_to_PR2_and_PR24 (changing collection names, esp. `omics_processing_set` to `data_generation_set`)
    - X_to_PR10 (adding type slot to all instances, as that migrator will just make all `type:DataGeneration` and not take into account the subclasses, e.g. `type:NucleotideSequencing` and `type:MassSpectrometry`)
- ~Update collection names that changed (all `WorkflowExecution(Activity)` subclass collections and `omics_processing_set` to `data_generation_set`). (needs to be refactored to account for cross-collection/database level migrations)~
  - This PR was closed because Eric implemented adapters to make changes to the whole database.
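
The BEFORE/AFTER notes above amount to a dependency graph, so one way to sanity-check a proposed execution order is a topological sort. The following is only a minimal sketch, not project code: the `RUNS_AFTER` dictionary and the helper names `execution_order` and `violations` are hypothetical, and only constraints between migrators that are named in this issue are encoded (constraints involving the untitled items or the workflow chain migrator are left out). It assumes Python 3.9+ for `graphlib`.

```python
from graphlib import TopologicalSorter

# Hypothetical encoding of the ordering notes in this issue.
# Each key must run AFTER every migrator in its value set.
RUNS_AFTER = {
    # PR31 compares 'used' to 'instrument_name', which PR19_and_PR70 removes.
    "X_to_PR19_and_PR70": {"X_to_PR31"},
    # Both PR31 and PR19_and_PR70 refer to the old collection names.
    "X_to_PR2_and_PR24": {"X_to_PR31", "X_to_PR19_and_PR70"},
    # PR10 calls the new collection names.
    "X_to_PR10": {"X_to_PR2_and_PR24"},
    # PR3 refines the type values that PR10 adds, using analyte_category from PR4.
    "X_to_PR3": {"X_to_PR4", "X_to_PR2_and_PR24", "X_to_PR10"},
}


def execution_order(runs_after):
    """Return one valid order; raises graphlib.CycleError if constraints conflict."""
    return list(TopologicalSorter(runs_after).static_order())


def violations(order, runs_after):
    """Return (should_run_first, should_run_later) pairs that `order` breaks."""
    position = {name: index for index, name in enumerate(order)}
    return [
        (before, after)
        for after, befores in runs_after.items()
        for before in befores
        if position[before] > position[after]
    ]


if __name__ == "__main__":
    order = execution_order(RUNS_AFTER)
    print("One valid order:", order)
    print("Violations:", violations(order, RUNS_AFTER))
```

Several orders can satisfy the same constraints, so the printed order is just one valid choice. `violations` can also be run against a hand-written order (such as the one in this list) to confirm that no stated constraint is broken.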