Tony Hutter

Results: 241 comments by Tony Hutter

Regarding: - Need to be able to cancel transfer of files on restart. Requires tracking set of transfer handles used (with redundancy to account for failures). Ensure everything was cancelled...

Somewhat related: https://github.com/LLNL/scr/issues/209

So as I understand it: 1. Job is writing checkpoints to node-local cache (like /tmp or /ssd). It hasn't flushed a checkpoint to the PFS yet. 2. Job dies 3....

I'm starting to implement this along with https://github.com/ECP-VeloC/AXL/issues/66. One observation: The new `AXL_Cleanup(char *path)` function I proposed would do a couple of things: 1. Remove old, unsuccessfully transferred files. 2....

Copying my comment from https://github.com/ECP-VeloC/AXL/issues/75#issuecomment-693626517: One idea is to have the poststage script call `axl_cp -U ` to "finalize" the transfers (mark them as done, rename the files to final...

Copying my comment from https://github.com/ECP-VeloC/AXL/issues/75#issuecomment-696956530: The more I think about this, the more I like this setup: 1. The post-stage script for AXL-only users would be `axl_cp -U `. This...

> The post-stage script for SCR users would be a new scr_finalize command we'd add to SCR.

We could also call it `scr_poststage`

Just speaking to `scr_poststage` implementation details: The best case scenario is for the user to call `scr_poststage` (provided by SCR) and have it handle everything. That is, the user wouldn't...

@adammoody thanks, having a `$SCR_PREFIX/.scr/scr.dataset.X/state_file` seems the way to go then. I'm trying to put together a test that will exercise all the moving parts...

Quick update - I'm currently able to do a checkpoint, cancel the transfer midway through, finish the transfer, and then manually create the summary file and add it to the...