
Globus patch - no need to lock the dataset while an upload is in progress, when tasks are monitored asynchronously.

Open landreev opened this issue 1 month ago • 6 comments

What this PR does / why we need it:

Globus uploads have so far been handled the same way as ingests: a dedicated lock is placed on the dataset for the duration of the transfer, preventing any further transfers or edits. This does not appear to be necessary when the asynchronous, database queue-based task monitoring mode is enabled. Since the whole point of Globus support, at HDV at least, is handling larger (TB-sized) data, these transfers can take a long time (potentially days), and keeping the dataset locked further complicates an already cumbersome workflow.

Which issue(s) this PR closes:

There's no corresponding issue as of now; this started as a production patch.

  • Closes #

Special notes for your reviewer:

I still have no idea how to go about creating meaningful tests for any Globus-related functionality. Any feedback is welcome.

Suggestions on how to test this:

This can be tested on one of the instances where Globus storage is configured: demo and dataverse-internal. In a collection with a Globus storage volume assigned, start a long-ish running Globus upload (it will need to be something in at least the 10s of MBs; and this is definitely a PR that will be easier to test from home, since transfers are obscenely fast between NESE and Harvard local networks). With this build, it should be possible to start another Globus transfer while the first one is still running; the "add files" and all the other buttons except for "Publish Dataset" should stay enabled for the duration. A finer test will be to stack multiple simultaneous transfers, and confirm that a) the above is still true and b) the Publish button stays disabled for as long as at least one transfer is still active, but becomes enabled again once the last one finishes. Similarly, the message about publishing being disabled should disappear at the end. I suggest not actually publishing the dataset, simply because an unpublished dataset lets you delete the draft afterwards and have the files stored at NESE permanently erased in the process. Even though it's a tape volume dedicated to testing that demo and internal are configured to use, it's still prudent not to leave junk on it unnecessarily.
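
For anyone working through the stacked-transfers test, the expected button states can be summarized as a toy model. This is only a sketch of the behavior described above, not Dataverse code: the class name, method names, and task ids are all hypothetical, and the real state lives in the database-backed task queue.

```python
class DatasetUiModel:
    """Toy model of what a tester should observe; hypothetical, not Dataverse code."""

    def __init__(self):
        # Ids of Globus transfers that have started but not yet finished.
        self.active_transfers = set()

    def start_transfer(self, task_id):
        # Starting a transfer is always allowed, even while others are running.
        self.active_transfers.add(task_id)

    def finish_transfer(self, task_id):
        self.active_transfers.discard(task_id)

    def can_add_files(self):
        # "Add files" and the other edit buttons stay enabled throughout.
        return True

    def can_publish(self):
        # Publishing is blocked while at least one transfer is still active.
        return not self.active_transfers
```

In this model, starting two transfers and finishing only one leaves `can_publish()` false; it turns true again only after the second one finishes, which matches point b) above.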

Does this PR introduce a user interface change? If mockups are available, please link/include them here:

Is there a release notes update needed for this change?:

Additional documentation:

landreev • Nov 12 '25 15:11