metacatui
Clean up DataPackage to address issues with submitting datasets
We've been encountering persistent issues with the handling of dataset submission errors. The DataPackage collection is a central component of this process, but in the 7+ years since its creation, it has accumulated some issues that may be contributing to the problems we're seeing. Cleaning up the DataPackage will help us improve error handling and make it easier to add unit tests and resource map validation.
Assessment of DataPackage:
- The `DataPackage` is a Backbone Collection being used like a Model, which complicates tracking and responding to property changes.
- Both `DataPackage` and `PackageModel` have methods that handle serialization of resource maps.
- There's no validation of resource maps before they are serialized and saved. Since Metacat also does not validate resource maps, invalid resource maps can be saved without error, which leads to broken datasets (the "missing files" issue).
- Some errors during the save process are not caught or communicated to the user, which results in the endless spinner issue.
- The `rdflib` dependency has not been updated in over 7 years.
- The `DataPackage` contains methods that are incomplete or unused (e.g. `transferQueue`).
- The `DataPackage` was intended to replace the older `PackageModel`, but this transition was never fully completed. Some functionality is still dependent on `PackageModel`, which is still being used in some places.
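To make the missing validation step concrete, here is a minimal sketch of a pre-save check. It is not the existing MetacatUI code: the function name `validateResourceMap` and the plain-object shape (`id`, `aggregates`, `documents`) are assumptions made for illustration, and the real resource map is RDF handled through `rdflib`. The idea is only that every package member should be aggregated by the map, and that every "documents" relationship should point at aggregated objects.

```javascript
// Hypothetical, simplified resource map shape (the real one is RDF):
//   { id: "rm1", aggregates: ["eml1", "data1"], documents: { eml1: ["data1"] } }
// Returns a list of human-readable problems; an empty list means "looks OK".
function validateResourceMap(resourceMap, memberIds) {
  const errors = [];

  if (!resourceMap.id) {
    errors.push("Resource map has no identifier.");
  }

  // Every object in the package should be aggregated by the map.
  const aggregated = new Set(resourceMap.aggregates || []);
  for (const id of memberIds) {
    if (!aggregated.has(id)) {
      errors.push(`Package member ${id} is not aggregated.`);
    }
  }

  // Both ends of each "documents" relationship should be aggregated.
  for (const [metadataId, dataIds] of Object.entries(resourceMap.documents || {})) {
    if (!aggregated.has(metadataId)) {
      errors.push(`Metadata ${metadataId} is not aggregated.`);
    }
    for (const dataId of dataIds) {
      if (!aggregated.has(dataId)) {
        errors.push(`${metadataId} documents ${dataId}, which is not aggregated.`);
      }
    }
  }

  return errors;
}
```

If the returned list is non-empty, the save could be aborted and the problems shown to the user instead of sending an invalid map to Metacat.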
Where DataPackage vs PackageModel are in use:
- `DataPackage` (to clean up)
  - `EML211EditorView`
- `PackageModel` (to deprecate)
  - `SearchResultView`
  - `DownloadButtonView`
  - `PackageTableView` (deprecated)
- Both `DataPackage` & `PackageModel`
  - `DataPackageView`
  - `MetadataView`
During the cleanup, we will:
- [ ] Add validation for resource maps before submission
- [ ] Improve error detection and handling during the save process
- [ ] Refactor `DataPackage` as a Model containing a Collection of DataONEObject models (EML/ScienceMetadata, DataObjects, and nested DataPackages)
- [ ] Separate system metadata into its own model
- [ ] Fully transition from `PackageModel` to `DataPackage`, deprecate `PackageModel`.
- [x] Fix all linting errors and warnings.
- [ ] Update rdflib to the latest version.
- [ ] Make sure we don't set multiple listeners on the same event (remove listeners before re-adding them).
- [ ] Write unit tests for `DataPackage`, at least for the most critical parts.
- [ ] Simplify and modularize complex methods
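Two of the items above (no duplicate listeners, and better error handling during save) can be sketched roughly as follows. This is an illustration under assumptions, not the planned implementation: `TinyEmitter` is a small stand-in written for this example rather than Backbone's event system, and `rebind`, `savePackage`, and the event names `saving`, `saved`, and `saveError` are hypothetical. In Backbone itself, the same remove-then-add pattern would typically be `stopListening` followed by `listenTo`.

```javascript
// Tiny stand-in for an event emitter; NOT Backbone's implementation.
class TinyEmitter {
  constructor() {
    this.handlers = new Map();
  }
  on(event, fn) {
    const list = this.handlers.get(event) || [];
    this.handlers.set(event, [...list, fn]);
  }
  off(event, fn) {
    const list = this.handlers.get(event) || [];
    this.handlers.set(event, list.filter((h) => h !== fn));
  }
  trigger(event, payload) {
    for (const fn of this.handlers.get(event) || []) fn(payload);
  }
}

// Remove the handler before re-adding it, so calling this repeatedly
// (for example on every re-render) still leaves exactly one listener.
function rebind(emitter, event, fn) {
  emitter.off(event, fn);
  emitter.on(event, fn);
}

// Hypothetical save wrapper: `upload` is any function returning a Promise.
// Every path ends in "saved" or "saveError", so a view listening for
// those events can always stop its spinner.
async function savePackage(pkg, upload) {
  pkg.trigger("saving");
  try {
    const result = await upload();
    pkg.trigger("saved", result);
    return true;
  } catch (error) {
    pkg.trigger("saveError", error);
    return false;
  }
}
```

A view would call `rebind(pkg, "saveError", showError)` whenever it renders; because the handler is removed first, the error is shown once no matter how many times the view has rendered.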