Resume continue the pipeline

Open wannaphong opened this issue 10 months ago • 1 comments

Hello! I run the Dolma pipeline under the Slurm Workload Manager, which enforces a time limit on jobs.

Is it possible to resume the pipeline from where it stopped, without rerunning the whole pipeline?

wannaphong • Feb 26 '25 04:02

Have a look at the ignore_existing and metadata_prefix options; they should work for tagging, converting, and similar steps. You can set both from the config file.

When ignore_existing is false, the processor checks the metadata directory to see whether a file has already been processed and, if so, skips it. Make sure to set metadata_prefix to a fixed path in your config file; otherwise it points to a different temporary directory each time your script runs, so the earlier metadata is never found and ignore_existing has no effect.
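To illustrate why the metadata path has to stay fixed, here is a minimal sketch of the skip-if-already-processed idea. It is not Dolma's actual code: the function run_resumable, the ".done" marker naming, and the process callback are all hypothetical, and only the ignore_existing flag mirrors the option named above.

```python
from pathlib import Path
from typing import Callable, Iterable, List


def run_resumable(
    inputs: Iterable[str],
    metadata_dir: str,
    process: Callable[[str], None],
    ignore_existing: bool = False,
) -> List[str]:
    """Process each input, recording completion in metadata_dir.

    Hypothetical helper for illustration only; Dolma's own bookkeeping
    may differ in naming and layout.
    """
    meta = Path(metadata_dir)
    meta.mkdir(parents=True, exist_ok=True)
    processed = []
    for path in inputs:
        # One marker file per input (assumed naming scheme).
        marker = meta / (Path(path).name + ".done")
        # With ignore_existing False, a marker means "already done": skip.
        if not ignore_existing and marker.exists():
            continue
        process(path)
        # Written only after process() returns, so a job killed
        # mid-file leaves no marker and the file is retried next run.
        marker.touch()
        processed.append(path)
    return processed
```

If metadata_dir were a fresh temporary directory on every run, no marker would ever be found and every file would be processed again, which is the situation described above. Under Slurm, the fixed directory would also need to live on storage that survives the job, not on node-local scratch.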

SimonSuster • Apr 30 '25 08:04

Hi! Thanks for the question. We’re currently working on closing out old tickets, and we apologize that we didn’t get to you in a timely fashion. We’re closing this out for now, but if you’d still like an answer, please re-open and we will get back to you!

baileykuehl • Jul 02 '25 19:07