omnipose
FileNotFoundError while training a new model
Hi all, I am wondering if anyone has encountered this error: FileNotFoundError: The directory '/home/kcutler/DataDrive/debug' does not exist.
I am trying to train a new omnipose model (Linux terminal, GPU). In the omnipose venv, the command is:

```
python -m omnipose --train --use_gpu --dir /home/thaocao/omnipose_training_images1/ --img_filter _img --mask_filter _masks --n_epochs 4000 --pretrained_model bact_phase_omni_2 --learning_rate 0.1 --diameter 0 --batch_size 16 --RAdam --all_channels --nclasses 2 --tyx 128,128 --save_each
```

I see that a folder 'models' was created within my given directory of training images, but nothing was saved. The images are single-channel 8-bit .tif files. The masks are binary 8-bit .tif files with foreground = 1 and background = 0.
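As a pre-flight sanity check (this is my own sketch, not part of omnipose), you can measure the foreground fraction of each mask before training, since omnipose's augmentation rejects crops it considers sparse or over-dense. The `lo`/`hi` thresholds here are illustrative assumptions, not omnipose's actual cutoffs:

```python
# Hypothetical pre-flight check: flag masks whose foreground fraction is
# near 0 (sparse) or near 1 (over-dense). Thresholds are assumptions.
import numpy as np

def check_mask_density(mask, lo=0.01, hi=0.9):
    """Return (foreground fraction, whether it falls within [lo, hi])."""
    frac = np.count_nonzero(mask) / mask.size
    return frac, lo <= frac <= hi

# Synthetic 8-bit binary masks standing in for the _masks .tif files:
dense = np.ones((300, 300), dtype=np.uint8)   # 100% foreground
ok = np.zeros((300, 300), dtype=np.uint8)
ok[100:200, 100:200] = 1                      # ~11% foreground

print(check_mask_density(dense))  # (1.0, False) -> would trigger the warning
print(check_mask_density(ok))
```

Running this over every mask in the training directory would point to files like the "Problematic index" entries in the log below.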
If anyone has any thoughts, I'd be hugely grateful! Thanks
I got the same error when trying to train. The problem comes from omnipose/core.py, line 2360: it looks like a debug save was left uncommented. If you have the code, you can comment that line out. Even then, a ValueError will be raised because the data didn't look the way omnipose expected.
```
2023-09-28 20:17:56,101 [CRITICAL] Sparse or over-dense image detected. Problematic index is: 4636. Image shape is: (2, 300, 300). tyx is: (224, 224). rescale is 1.0
2023-09-28 20:17:57,345 [CRITICAL] Sparse or over-dense image detected. Problematic index is: 538. Image shape is: (2, 520, 696). tyx is: (224, 224). rescale is 1.0

Original Traceback (most recent call last):
  File "lib/python3.10/site-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
    data = fetcher.fetch(index)
  File "lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "lib/python3.10/site-packages/omnipose/data.py", line 292, in __getitem__
    imgi[i], labels[i], scale[i] = random_crop_warp(img=self.data[idx],
  File "lib/python3.10/site-packages/omnipose/core.py", line 2422, in random_crop_warp
    return random_crop_warp(img, Y, tyx, v1, v2, nchan, rescale, scale_range,
  File "lib/python3.10/site-packages/omnipose/core.py", line 2422, in random_crop_warp
    return random_crop_warp(img, Y, tyx, v1, v2, nchan, rescale, scale_range,
  File "lib/python3.10/site-packages/omnipose/core.py", line 2422, in random_crop_warp
    return random_crop_warp(img, Y, tyx, v1, v2, nchan, rescale, scale_range,
  [Previous line repeated 98 more times]
  File "lib/python3.10/site-packages/omnipose/core.py", line 2360, in random_crop_warp
    skimage.io.imsave('/home/kcutler/DataDrive/debug/img'+str(depth)+'.png',img[0])
  File "lib/python3.10/site-packages/skimage/io/_io.py", line 143, in imsave
    return call_plugin('imsave', fname, arr, plugin=plugin, **plugin_args)
  File "lib/python3.10/site-packages/skimage/io/manage_plugins.py", line 207, in call_plugin
    return func(*args, **kwargs)
  File "lib/python3.10/site-packages/imageio/v2.py", line 263, in imwrite
    with imopen(uri, "wi", **imopen_args) as file:
  File "lib/python3.10/site-packages/imageio/core/imopen.py", line 113, in imopen
    request = Request(uri, io_mode, format_hint=format_hint, extension=extension)
  File "lib/python3.10/site-packages/imageio/core/request.py", line 247, in __init__
    self._parse_uri(uri)
  File "lib/python3.10/site-packages/imageio/core/request.py", line 412, in _parse_uri
    raise FileNotFoundError("The directory %r does not exist" % dn)
FileNotFoundError: The directory '/home/kcutler/DataDrive/debug' does not exist
```
Sorry I didn't see this. @Jolanda5 @ThaoCao, this error comes up if your images are too sparse or completely full of cell pixels, indicating that your training data is not well-formatted. I should replace this with an alternate directory or just kill the process, but I suppose this accidentally accomplishes the latter. In the meantime, you can edit this save command so that you can see which image is giving you the error.
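One way to act on this suggestion (a sketch of my own, not upstream omnipose code; exact line numbers vary by version) is to redirect the hardcoded debug path to a directory that is guaranteed to exist, so the offending crop gets written out instead of crashing the worker:

```python
# Sketch: build a writable debug directory and use it in place of the
# hardcoded '/home/kcutler/DataDrive/debug' path in omnipose/core.py.
import os
import tempfile

debug_dir = os.path.join(tempfile.gettempdir(), "omnipose_debug")
os.makedirs(debug_dir, exist_ok=True)  # create it if missing

# Then, in core.py, the save line could become something like:
# skimage.io.imsave(os.path.join(debug_dir, f'img{depth}.png'), img[0])
print(os.path.isdir(debug_dir))  # True
```

After the next failing run, inspecting the saved crops in that directory should reveal which training images are sparse or over-dense.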
My solution was to comment out the data check and the recursion, because we want our training set to include negative GT (ground truth where the network is not supposed to detect anything).
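Rather than removing the check entirely, a middle ground (again my own sketch, not omnipose's implementation) is to accept genuinely empty masks while still rejecting near-saturated ones; the 0.95 cutoff is an assumed threshold:

```python
# Sketch: a crop-acceptance test that permits negative ground truth
# (all-background masks) but still rejects over-dense crops.
import numpy as np

def crop_is_acceptable(mask, allow_empty=True, hi=0.95):
    frac = np.count_nonzero(mask) / mask.size
    if frac == 0:
        return allow_empty  # negative GT: keep only if explicitly allowed
    return frac <= hi       # still reject near-saturated masks

empty = np.zeros((128, 128), dtype=np.uint8)
full = np.ones((128, 128), dtype=np.uint8)
print(crop_is_acceptable(empty))  # True
print(crop_is_acceptable(full))   # False
```

This keeps the safety net for malformed data while letting intentional empty images through.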