Training argument --save_each not working using Cellpose 3
Hi,
in Cellpose 2 you could use the training arguments --save_every xx and --save_each to automatically save a model after a given number of epochs, which worked very nicely. Unfortunately, the latter argument, --save_each, no longer works in Cellpose 3 (error: main.py: error: unrecognised arguments: --save_each). If I omit it and use only --save_every xx, just one model is saved at the end.
Any idea how to solve the problem?
Thanks & best, Mario
The problem in 3.0 is that it saves under the same model name regardless of the epoch number (iepoch in the code), so each save overwrites the previous one. While we wait for an update, I patched lib/pythonX.XX/site-packages/cellpose/train.py under the Cellpose installation directory to save the current model under a per-epoch filename, model_path_save, as below:
    if iepoch > 0 and iepoch % save_every == 0:
        model_path_save = f"{model_path}_{iepoch}"
        net.save_model(model_path_save)
This will append the epoch number to the output name provided by either --model_name_out or the default one.
Below is the patch file (somehow I cannot attach it even though .patch is an acceptable extension):
--- train.py 2024-05-01 23:46:07.605886495 -0700
+++ /home/XXXX/train.py 2024-05-02 17:28:24.161011936 -0700
@@ -475,7 +475,8 @@
                 lavg, nsum = 0, 0
 
         if iepoch > 0 and iepoch % save_every == 0:
-            net.save_model(model_path)
+            model_path_save = f"{model_path}_{iepoch}"
+            net.save_model(model_path_save)
     net.save_model(model_path)
 
     return model_path
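To see which files the patched loop ends up writing, here is a minimal standalone sketch of the same save logic. It is not Cellpose code: train_with_checkpoints and the save_model callback are hypothetical stand-ins for the training loop and net.save_model, and the training step itself is left out.

```python
def train_with_checkpoints(model_path, n_epochs, save_every, save_model):
    """Mimic the patched save logic; save_model is a hypothetical callback."""
    for iepoch in range(n_epochs):
        # ... one epoch of training would run here ...
        if iepoch > 0 and iepoch % save_every == 0:
            # Per-epoch checkpoint: append the epoch number to the base name.
            save_model(f"{model_path}_{iepoch}")
    # Final model is still written under the unmodified name.
    save_model(model_path)
    return model_path


# Record the paths instead of writing files, to inspect the naming scheme.
saved = []
train_with_checkpoints("models/my_model", n_epochs=100, save_every=25,
                       save_model=saved.append)
print(saved)
# ['models/my_model_25', 'models/my_model_50', 'models/my_model_75', 'models/my_model']
```

Epoch 0 is skipped by the iepoch > 0 check, and the final model keeps the plain name, so downstream scripts that expect that name should keep working.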
Great, thank you very much!!
This is fixed now, thanks for reporting it.
Thank you very much!!