DeepLabCut
Feature request: saving analyzed video data at checkpoints
Is your feature request related to a problem? Please describe.
I'm always frustrated when I run dlc.analyze_videos on HPCs and the videos have highly variable analysis times, causing some jobs to run for 5 hours and then fail, with nothing to show for it. E.g., for a batch of 1-hour videos, some finish in 2-3 hours and some finish in 10. (This might be due to slightly different camera angles between sessions; sessions with weirder / more unusual angles may be harder for the model to evaluate? Not sure. Separate issue.)
Describe the solution you'd like
I would like to be able to say: save a checkpoint every 100,000 frames. Then the analysis can be restarted in the middle without having to redo all of the previous computation (see the sketch below).
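For illustration only, the call might look something like this. Note that save_checkpoint_every is not an existing deeplabcut.analyze_videos argument; it's a hypothetical keyword to show the requested behavior, and the paths are placeholders.

```python
import deeplabcut as dlc

# Hypothetical usage sketch: `save_checkpoint_every` does NOT exist today,
# it just illustrates the requested checkpointing behavior.
dlc.analyze_videos(
    "/path/to/config.yaml",
    ["/path/to/session1.mp4"],
    shuffle=1,
    save_checkpoint_every=100_000,  # dump partial predictions every 100k frames
)
# If the job is killed, re-running the same call would detect the partial
# output and resume from the last checkpointed frame instead of frame 0.
```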
Describe alternatives you've considered
It looks like use_shelf is an option you've considered for this, but 1) it is only implemented for multi-animal projects, and 2) it's memory-inefficient because it holds the entire dataset in memory (if I'm skimming the docs right). Why not save a pickle with a suffix like PARTIAL or CHECKPOINT, check whether that pickle exists before starting, and load it in? My (simplistic and maybe wrong) understanding is that DLC operates frame by frame, so restarting from the middle shouldn't affect outcomes.
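A minimal sketch of that save/resume idea, assuming a frame-by-frame analysis loop. This is not DeepLabCut's actual internals; predict_frame and read_frames stand in for whatever per-frame inference and video iteration DLC does under the hood, and the _PARTIAL.pickle naming is just the suffix convention suggested above.

```python
import os
import pickle

CHECKPOINT_EVERY = 100_000  # frames between partial saves


def analyze_with_checkpoints(video_path, predict_frame, read_frames):
    """Analyze a video frame by frame, periodically pickling partial results."""
    ckpt_path = video_path + "_PARTIAL.pickle"

    # Resume: load any previously saved partial predictions.
    predictions = {}
    if os.path.exists(ckpt_path):
        with open(ckpt_path, "rb") as f:
            predictions = pickle.load(f)

    start = len(predictions)  # first unprocessed frame index
    for idx, frame in enumerate(read_frames(video_path, start=start), start=start):
        predictions[idx] = predict_frame(frame)

        # Periodically dump partial results so a killed job loses little work.
        if (idx + 1) % CHECKPOINT_EVERY == 0:
            with open(ckpt_path, "wb") as f:
                pickle.dump(predictions, f)

    # Final save; the caller could rename this to the normal output file
    # and delete the PARTIAL pickle once the whole video is done.
    with open(ckpt_path, "wb") as f:
        pickle.dump(predictions, f)
    return predictions
```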
Thanks! Not urgent but maybe for the roadmap.
> the videos have highly variable analysis times, causing some jobs to run for 5 hours and then fail, with nothing to show for it. E.g., for a batch of 1-hour videos, some finish in 2-3 hours and some finish in 10.
This actually seems to be an issue with the HPC; I never see different analysis runtimes for "more complex" data. We even benchmarked this across keypoints, models, and a lot of hardware and saw no such issues - see Warren & Mathis 2018, Mathis 2021, and Kane et al. 2020.
Note: these were older GPUs, so things have only gotten faster since: