Recommended Settings (nimg_per_tif) for 3D segmentation

Open postnubilaphoebus opened this issue 4 months ago • 5 comments

When using make_train.py to generate three different slice-views of input image volumes, which settings do you recommend for nimg_per_tif? For instance, say you have anisotropic volumes of shape 34 * 512 * 512 and set the anisotropic factor to 7, should you set nimg_per_tif to 34? Or perhaps 512?

Also, in the docs https://cellpose.readthedocs.io/en/latest/gui.html#using-the-gui it says "If you have 3D data, please save random XY, YZ and XZ slices through your 3D data, ideally sufficiently spaced from each other so the information each slice has is distinct. Then put these slices into a folder and start the human-in-the-loop training". Which spacing should I provide for my data to ensure optimal performance? Also, since you have to provide 2d slices for 3d training, is the training/validation split handled based on image number or slice number, and are there any naming conventions I should default to in order to avoid validation leakage?

Finally, can you explain your reasoning for the slice number generation? For example, for an image of shape 34 * 512 * 512, and supplying anisotropy factor of 7 as well as nimg_per_tif=512, I get: 34 XY slices of shape 512 * 512, 512 ZY slices of shape 238 * 512, and 512 ZX slices of shape 238 * 512. Is it intended to not generate more slices in z for XY planes even if you upsample? You could potentially use trilinear interpolation for upsampling and then not get duplicates in z, enabling a higher number of XY slices.

Thank you for your help!

postnubilaphoebus avatar Aug 13 '25 10:08 postnubilaphoebus

On a different note, why does make_train.py ignore mask files during partitioning?

postnubilaphoebus avatar Aug 13 '25 11:08 postnubilaphoebus

which settings do you recommend for nimg_per_tif?

The default of 10 will likely work for you. This will split each dimension into 10 slices. If you have very high-resolution data and a small volume, decrease this parameter to get fewer, less similar slices; for lower-resolution volumes, increase it. See the next question.

Which spacing should I provide for my data to ensure optimal performance?

We don't have data on this, but my recommendation would be to use the images that are farthest apart first, since these contain the least similar information. Then use the HITL (human-in-the-loop) feature to train on the closer images. How similar your images are will determine the best number of planes to skip (very high-resolution data will have more similar adjacent planes).
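
As a rough illustration of the "farthest apart first" idea, the sketch below picks evenly spaced plane indices along one axis. It is only an assumption about how one might choose slices by hand; the helper name spaced_slice_indices is hypothetical and this is not taken from make_train.py.

```python
import numpy as np

def spaced_slice_indices(n_planes, n_slices):
    """Return up to n_slices plane indices spread evenly over [0, n_planes - 1].

    Hypothetical helper for illustration; not part of cellpose.
    """
    n_slices = min(n_slices, n_planes)  # cannot pick more planes than exist
    idx = np.linspace(0, n_planes - 1, n_slices)
    # Round to integer plane numbers and drop any duplicates
    return np.unique(np.round(idx).astype(int))

# Z axis of a 34 x 512 x 512 volume, 10 slices (the nimg_per_tif default)
z_idx = spaced_slice_indices(34, 10)
# Y axis of the same volume
y_idx = spaced_slice_indices(512, 10)
```

Starting with a small n_slices gives the most widely separated planes; planes in between can be added in later human-in-the-loop rounds.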

Finally, can you explain your reasoning for the slice number generation? [...] Is it intended to not generate more slices in z for XY planes even if you upsample?

The script doesn't add additional XY slices because these slices are already isotropic (X resolution == Y resolution). Z is interpolated for the XZ and YZ slices so that the pixel resolution stays isotropic for the network (because Z resolution != X or Y resolution).
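
The shapes reported in the question (34 XY slices of 512 * 512, and ZY/ZX slices of 238 * 512 with an anisotropy factor of 7) can be reproduced with a small NumPy sketch. The linear interpolation below is an assumption chosen for illustration, and upsample_z is a hypothetical helper; the interpolation that make_train.py actually uses may differ.

```python
import numpy as np

def upsample_z(plane, anisotropy):
    """Linearly interpolate a (Z, N) plane along Z to (round(Z * anisotropy), N).

    Hypothetical helper for illustration; not the cellpose implementation.
    """
    nz = plane.shape[0]
    new_nz = int(round(nz * anisotropy))
    pos = np.linspace(0, nz - 1, new_nz)   # target positions in source Z units
    lo = np.floor(pos).astype(int)         # lower neighbouring plane
    hi = np.minimum(lo + 1, nz - 1)        # upper neighbouring plane (clamped)
    w = (pos - lo)[:, None]                # interpolation weight per output row
    return (1 - w) * plane[lo] + w * plane[hi]

volume = np.random.rand(34, 512, 512).astype(np.float32)  # axes: (Z, Y, X)
anisotropy = 7

xy = volume[17]                                  # XY slice, left untouched
zx = upsample_z(volume[:, 256, :], anisotropy)   # fixed Y, Z stretched by 7
zy = upsample_z(volume[:, :, 256], anisotropy)   # fixed X, Z stretched by 7

print(xy.shape, zx.shape, zy.shape)
```

Only the slices that contain the Z axis are resampled (34 * 7 = 238 rows), which is why the number of XY slices stays at 34.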

On a different note, why does make_train.py ignore mask files during partitioning?

When the script was written, it was assumed that the user wouldn't have mask data.

mrariden avatar Aug 27 '25 21:08 mrariden

Hi @mrariden,

Thank you for your reply! I have now saved training slices for 3D training in a specified folder. These training images are greyscale, and make_train.py converts them to 3 channels by default via the convert_image function. For the training masks, I changed pm in make_train.py to pm = [(0, 1, 2), (2, 0, 1), (1, 0, 2)] and commented out convert_image to generate the mask slices. For a test run, I tried the following script:

from cellpose import io, models, train
from cellpose.io import get_image_files
io.logger_setup()
import os

train_dir = "/home/laurids/Documents/cellpose/collected"
test_dir = "/home/laurids/Documents/cellpose/test"
# blah = get_image_files(train_dir, mask_filter="_masks")
# import pdb; pdb.set_trace()
output = io.load_train_test_data(train_dir, test_dir, image_filter="_img",
                                mask_filter="_masks", look_one_level_down=False)
images, labels, image_names, test_images, test_labels, image_names_test = output

model = models.CellposeModel(gpu=True)

model_path, train_losses, test_losses = train.train_seg(model.net,
                            train_data=images, train_labels=labels,
                            test_data=test_images, test_labels=test_labels,
                            weight_decay=0.1, learning_rate=1e-5,
                            n_epochs=100, model_name="my_new_model")

However, I run into an issue when training, see below:

2025-08-30 21:08:58,198 [INFO] WRITING LOG OUTPUT TO /home/laurids/.cellpose/run.log
2025-08-30 21:08:58,198 [INFO] 
cellpose version: 	3.0.7.dev3+g509ffca 
platform:       	linux 
python version: 	3.8.5 
torch version:  	2.4.1+cu121
folder /home/laurids/Documents/cellpose/collected
2025-08-30 21:08:58,203 [INFO] not all flows are present, running flow generation for all images
2025-08-30 21:08:58,274 [INFO] 60 / 60 images in /home/laurids/Documents/cellpose/collected folder have labels
folder /home/laurids/Documents/cellpose/test
2025-08-30 21:08:58,275 [INFO] not all flows are present, running flow generation for all images
2025-08-30 21:08:58,276 [INFO] reading tiff with 125 planes
100%|███████████████████████████████████████████████████████████████████████████████████████████████████| 125/125 [00:00<00:00, 3380.76it/s]
2025-08-30 21:08:58,316 [INFO] reading tiff with 125 planes
100%|███████████████████████████████████████████████████████████████████████████████████████████████████| 125/125 [00:00<00:00, 2988.76it/s]
2025-08-30 21:08:58,358 [INFO] 1 / 1 images in /home/laurids/Documents/cellpose/test folder have labels
2025-08-30 21:08:58,592 [INFO] ** TORCH CUDA version installed and working. **
2025-08-30 21:08:58,592 [INFO] >>>> using GPU
2025-08-30 21:08:58,710 [INFO] computing flows for labels
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████| 60/60 [00:05<00:00, 10.81it/s]
2025-08-30 21:09:04,344 [INFO] flows precomputed
2025-08-30 21:09:04,359 [INFO] >>> computing diameters
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 60/60 [00:00<00:00, 2554.65it/s]
  0%|                                                                                                                 | 0/1 [00:00<?, ?it/s]/home/laurids/anaconda3/envs/cellposenew/lib/python3.8/site-packages/numpy/core/fromnumeric.py:3464: RuntimeWarning: Mean of empty slice.
  return _methods._mean(a, axis=axis, dtype=dtype,
/home/laurids/anaconda3/envs/cellposenew/lib/python3.8/site-packages/numpy/core/_methods.py:192: RuntimeWarning: invalid value encountered in scalar divide
  ret = ret.dtype.type(ret / rcount)
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 891.84it/s]
2025-08-30 21:09:04,385 [INFO] >>> normalizing {'lowhigh': None, 'percentile': None, 'normalize': True, 'norm3D': False, 'sharpen_radius': 0, 'smooth_radius': 0, 'tile_norm_blocksize': 0, 'tile_norm_smooth3D': 1, 'invert': False}
2025-08-30 21:09:08,312 [INFO] >>> n_epochs=100, n_train=60, n_test=1
2025-08-30 21:09:08,312 [INFO] >>> AdamW, learning_rate=0.00001, weight_decay=0.10000
2025-08-30 21:09:08,888 [INFO] >>> saving model to /home/laurids/Documents/cellpose/models/my_new_model
len(lbls) 8
len(imgs) 8
Traceback (most recent call last):
  File "train.py", line 19, in <module>
    model_path, train_losses, test_losses = train.train_seg(model.net,
  File "/home/laurids/anaconda3/envs/cellposenew/lib/python3.8/site-packages/cellpose/train.py", line 428, in train_seg
    imgi, lbl = transforms.random_rotate_and_resize(imgs, Y=lbls, rescale=rsc,
  File "/home/laurids/anaconda3/envs/cellposenew/lib/python3.8/site-packages/cellpose/transforms.py", line 868, in random_rotate_and_resize
    I = cv2.warpAffine(img[k], M, (xy[1], xy[0]), flags=cv2.INTER_LINEAR)
IndexError: index 125 is out of bounds for axis 0 with size 125

Do you have any idea what could cause the index error with cv2.warpAffine? I can provide the collected folder containing my training images if you wish; it is only 50 MB.

postnubilaphoebus avatar Aug 30 '25 19:08 postnubilaphoebus

You will need to add connected components and label renumbering steps after slicing out your volumes so that the masks make sense for 2D.
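
To illustrate why this matters: a single 3D object can appear as several disconnected pieces in one 2D slice, all sharing the same label. The sketch below splits each label into its 4-connected components and renumbers them 1..N. It is a minimal, slow, NumPy-only illustration with a hypothetical function name (relabel_slice); scipy.ndimage.label or skimage.measure.label would be the faster route in practice.

```python
from collections import deque

import numpy as np

def relabel_slice(mask2d):
    """Split each label into 4-connected components and renumber them 1..N.

    Hypothetical helper for illustration; background (0) is left as 0.
    """
    h, w = mask2d.shape
    out = np.zeros((h, w), dtype=np.int32)
    next_id = 0
    for y in range(h):
        for x in range(w):
            if mask2d[y, x] == 0 or out[y, x] != 0:
                continue  # background, or already assigned to a component
            next_id += 1
            label = mask2d[y, x]
            out[y, x] = next_id
            queue = deque([(y, x)])
            # Breadth-first flood fill over pixels with the same original label
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and out[ny, nx] == 0 and mask2d[ny, nx] == label):
                        out[ny, nx] = next_id
                        queue.append((ny, nx))
    return out

# Label 1 appears as two separate pieces in this slice; label 5 is one piece
slice_mask = np.array([[1, 0, 1],
                       [0, 0, 0],
                       [5, 5, 0]])
relabeled = relabel_slice(slice_mask)
# relabeled is [[1, 0, 2], [0, 0, 0], [3, 3, 0]]
```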

From the error message it sounds like you may have incorrectly resized/reshaped the slices, but I'm not positive. You should double check that their shapes are correct.
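
A quick way to double check the shapes is to compare every image with its mask before training. The sketch below assumes each mask should be 2D (H, W) and each image either 2D or 3D with a channel axis first or last; find_shape_problems is a hypothetical helper, not a cellpose function. Note that the log above reports a TIFF with 125 planes in the test folder, which this check would flag if it was loaded as a single 3D array.

```python
import numpy as np

def find_shape_problems(images, labels):
    """Return a list of messages for image/mask pairs whose shapes look wrong.

    Hypothetical helper for illustration; assumes 2D masks of shape (H, W).
    """
    problems = []
    for i, (img, lbl) in enumerate(zip(images, labels)):
        if lbl.ndim != 2:
            problems.append(f"pair {i}: mask has shape {lbl.shape}, expected 2D (H, W)")
            continue
        ok = (
            (img.ndim == 2 and img.shape == lbl.shape)            # (H, W)
            or (img.ndim == 3 and img.shape[:2] == lbl.shape)     # (H, W, C)
            or (img.ndim == 3 and img.shape[1:] == lbl.shape)     # (C, H, W)
        )
        if not ok:
            problems.append(
                f"pair {i}: image shape {img.shape} does not match mask shape {lbl.shape}"
            )
    return problems

# Synthetic example: the third pair is a 125-plane stack instead of a 2D slice
demo_images = [np.zeros((64, 64)), np.zeros((3, 64, 64)), np.zeros((125, 64, 64))]
demo_labels = [np.zeros((64, 64)), np.zeros((64, 64)), np.zeros((125, 64, 64))]
demo_problems = find_shape_problems(demo_images, demo_labels)
for message in demo_problems:
    print(message)
```

The same function could be called on the images/labels and test_images/test_labels lists from the script above.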

mrariden avatar Oct 07 '25 21:10 mrariden

@postnubilaphoebus I encountered your cv2.warpAffine error with data containing only a single channel. Upon inspection, this was due to incorrect handling of the single-channel case during image preprocessing. I have opened an issue with a potential solution here: https://github.com/MouseLand/cellpose/issues/1338.

Hope that helps!

PandaGab avatar Oct 15 '25 17:10 PandaGab