Add Native 3-D Volume Training Mode and 3-D + T mask stitching to Cellpose-SAM
Cellpose-SAM excels at 3-D inference, but training still expects 2-D slices.
This forces users to:
- pre-slice every z-stack, which discards the volumetric context (including anisotropic sampling) that the model needs to learn 3-D flow fields
- manage voxel anisotropy by hand
- write extra stitching code after fine-tuning a model, which slows training iterations
Describe the solution you’d like
- A --train_3d flag (CLI + GUI) that:
  - Accepts (Z, Y, X) or (C, Z, Y, X) volumes + masks
  - Uses a 3-D variant of the Cellpose head (3-D flow + distance outputs)
  - Handles voxel anisotropy internally
  - Supports patch-based sampling & mixed precision for GPU efficiency on consumer hardware such as an RTX 4090 or 5090
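To make the anisotropy and patch-sampling points concrete, here is a minimal sketch of what anisotropy-aware patch sampling could look like. It is not existing Cellpose code: the function name `sample_patch` and the parameters `patch_xy` and `anisotropy` (z spacing divided by xy spacing) are hypothetical, and a real implementation might resample the volume instead of shrinking the z extent.

```python
import numpy as np

def sample_patch(volume, masks, patch_xy=64, anisotropy=2.0, rng=None):
    """Crop one random training patch covering a roughly cubic physical region.

    volume:     (Z, Y, X) or (C, Z, Y, X) image array.
    masks:      (Z, Y, X) integer label array.
    anisotropy: z spacing divided by xy spacing (hypothetical parameter).
    """
    rng = np.random.default_rng() if rng is None else rng
    Z, Y, X = masks.shape
    if volume.shape[-3:] != (Z, Y, X):
        raise ValueError("volume and masks must share the same (Z, Y, X) shape")

    # Coarser z sampling means fewer planes span the same physical depth.
    pz = min(Z, max(1, int(round(patch_xy / anisotropy))))
    py, px = min(Y, patch_xy), min(X, patch_xy)

    # Random corner such that the patch stays inside the volume.
    z0 = int(rng.integers(0, Z - pz + 1))
    y0 = int(rng.integers(0, Y - py + 1))
    x0 = int(rng.integers(0, X - px + 1))
    crop = (slice(z0, z0 + pz), slice(y0, y0 + py), slice(x0, x0 + px))

    # The leading Ellipsis keeps an optional channel axis untouched.
    return volume[(..., *crop)], masks[crop]
```

With `anisotropy=2.0` and `patch_xy=64`, a patch spans 32 planes in z and 64 pixels in y and x, so each patch covers about the same physical extent along every axis without the user rescaling anything by hand.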
3-D + T mask stitching for time-lapse tracking
Cellpose already exposes a basic stitching option (stitch_threshold > 0) that links 2-D masks across adjacent z-planes by IoU. The new --train_3d workflow should extend that idea to temporal stitching, so that a single instance ID can be followed through successive volumes (Z, Y, X, T):
- Input layout: stacks shaped (T, Z, Y, X) plus matching (T, Z, Y, X) label masks.
- Usage: python -m cellpose ... --train_3d --stitch_3dt --stitch_threshold 0.7
  - --stitch_3dt activates both spatial and temporal stitching.
  - --stitch_threshold (0–1) is the IoU cutoff used to merge a mask in frame t with its best-matching mask in frame t + 1.
- Algorithm:
  - At every time step, compute pairwise 3-D IoU between masks at t and masks at t + 1.
  - Build a bipartite graph and solve a max-IoU assignment (Hungarian or greedy) subject to the threshold.
  - Propagate the parent ID forward; create a new ID if no match is found.
  - Optionally fill gaps ≤ gap_max frames with linear interpolation of centroids.
- Outputs: saves a (Z, Y, X, T) mask stack where each nucleus keeps the same label across time, plus a .csv track table with columns track_id, first_frame, last_frame, mean_volume, …
- Why integrate now: scientists training 3-D models usually work with time-lapse data (e.g. spheroid growth, embryo development). Training and tracking in the same UI lowers friction and guarantees mask compatibility.
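For illustration, the linking steps and the track table described above could be sketched roughly as follows. This is only a sketch under stated assumptions, not a proposed final implementation: the function names `iou_matrix`, `stitch_3dt`, and `track_table` are hypothetical, it uses the greedy variant of the assignment rather than Hungarian, it omits gap filling, and it keeps the (T, Z, Y, X) input layout (`np.moveaxis(out, 0, -1)` would give (Z, Y, X, T)).

```python
import numpy as np

def iou_matrix(a, b):
    """Pairwise IoU between the labels of two 3-D integer masks (0 = background)."""
    ids_a = np.unique(a[a > 0])
    ids_b = np.unique(b[b > 0])
    vol_b = {int(l): int((b == l).sum()) for l in ids_b}
    iou = np.zeros((len(ids_a), len(ids_b)))
    for i, la in enumerate(ids_a):
        in_a = a == la
        vol_a = int(in_a.sum())
        # Only labels of b that overlap this object can have nonzero IoU.
        hits, inter = np.unique(b[in_a], return_counts=True)
        for lb, n in zip(hits, inter):
            if lb == 0:
                continue
            j = int(np.searchsorted(ids_b, lb))
            iou[i, j] = n / (vol_a + vol_b[int(lb)] - n)
    return ids_a, ids_b, iou

def stitch_3dt(masks, stitch_threshold=0.7):
    """Relabel a (T, Z, Y, X) stack so matched objects keep one ID over time."""
    out = np.zeros(masks.shape, dtype=np.int32)
    next_id = 1
    for lab in np.unique(masks[0][masks[0] > 0]):
        out[0][masks[0] == lab] = next_id
        next_id += 1
    for t in range(1, masks.shape[0]):
        ids_prev, ids_cur, iou = iou_matrix(out[t - 1], masks[t])
        # Greedy assignment: accept candidate pairs in order of decreasing IoU.
        cand = np.argwhere((iou >= stitch_threshold) & (iou > 0))
        order = np.argsort(-iou[cand[:, 0], cand[:, 1]])
        used_prev, assigned = set(), {}
        for i, j in cand[order]:
            i, j = int(i), int(j)
            if i in used_prev or j in assigned:
                continue
            used_prev.add(i)
            assigned[j] = int(ids_prev[i])
        for j, lab in enumerate(ids_cur):
            if j in assigned:
                new_id = assigned[j]   # propagate the parent ID forward
            else:
                new_id = next_id       # no match: start a new track
                next_id += 1
            out[t][masks[t] == lab] = new_id
    return out

def track_table(stitched):
    """Summarise each track as one row: ID, first/last frame, mean volume in voxels."""
    rows = []
    n_frames = stitched.shape[0]
    for tid in np.unique(stitched[stitched > 0]):
        vols = (stitched == tid).reshape(n_frames, -1).sum(axis=1)
        frames = np.nonzero(vols)[0]
        rows.append({
            "track_id": int(tid),
            "first_frame": int(frames[0]),
            "last_frame": int(frames[-1]),
            "mean_volume": float(vols[frames].mean()),
        })
    return rows
```

The rows returned by `track_table` can be written with `csv.DictWriter` to produce the requested .csv. Swapping the greedy loop for an optimal assignment (for example SciPy's `linear_sum_assignment` on the negated IoU matrix) would change only the matching step.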