[INSTALL] Mac M1-M3 support Cellpose v3

Open erikgerdtsson opened this issue 1 year ago • 4 comments

I was not able to get the GPU to work for v3 (GUI) on my MacBook Pro M3 with:

python -m cellpose --gpu_device mps --use_gpu

I then followed the Cellpose documentation's instructions for M1 Macs, which point to Peter Sobolewski's branch (Cellpose v2) with GPU support; it seems to rely on PyTorch. That seems to work better.

https://github.com/psobolewskiPhD/cellpose/tree/feature/add_MPS_device

conda create --name cellpose-dev python=3.9 -y
conda activate cellpose-dev
conda install napari
git clone https://github.com/psobolewskiPhD/cellpose.git
cd cellpose
git fetch
git switch feature/add_MPS_device
conda install imagecodecs -y
pip install -e .

python -m cellpose

Are there any plans to release a mac M1-M3 version of v3?

#668

Thanks for an amazing tool.

erikgerdtsson avatar Mar 06 '24 12:03 erikgerdtsson

I'm sorry, for some reason Apple has not added torch support for double-precision computations, as Nvidia has. We need to test whether there is a loss in performance with single precision, as in @psobolewskiPhD's version of the code, before we can merge in those changes. Thank you, Peter, for your efforts on this.

carsen-stringer avatar Mar 06 '24 12:03 carsen-stringer
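For anyone who wants to check what their own machine reports, here is a minimal sketch that picks a PyTorch device and probes whether a given precision works on it. The helper names `pick_device` and `supports_dtype` are hypothetical and not part of Cellpose; the code assumes only the standard `torch.backends.mps.is_available()` and `torch.cuda.is_available()` calls, and it falls back quietly when PyTorch is not installed.

```python
def pick_device():
    """Return 'mps', 'cuda', or 'cpu' depending on what PyTorch reports."""
    try:
        import torch
    except ImportError:
        # Assumption for this sketch: without PyTorch, report plain CPU.
        return "cpu"
    mps = getattr(torch.backends, "mps", None)  # absent in old PyTorch builds
    if mps is not None and mps.is_available():
        return "mps"
    if torch.cuda.is_available():
        return "cuda"
    return "cpu"


def supports_dtype(device, dtype_name):
    """Try a tiny computation in the given dtype on the given device."""
    try:
        import torch
    except ImportError:
        return False
    try:
        x = torch.ones(2, dtype=getattr(torch, dtype_name), device=device)
        (x + x).sum().item()  # pull the result back so the work really runs
        return True
    except (TypeError, RuntimeError, AssertionError):
        return False


if __name__ == "__main__":
    device = pick_device()
    for name in ("float32", "float64"):
        print(device, name, supports_dtype(device, name))
```

This only reports whether a small tensor operation succeeds; it says nothing about whether segmentation quality changes between single and double precision.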

I see, thank you.

erikgerdtsson avatar Mar 06 '24 12:03 erikgerdtsson

I'll try to revisit that and make a PR if there is still something special about my branch.

psobolewskiPhD avatar Mar 06 '24 13:03 psobolewskiPhD

I would also be really keen for either an updated version from Peter or an integration into v3 in general!

I saw significant speed improvements using the GPU, but the "cyto3" model performs better on my cells. I tried to load the cyto3 model as a pretrained model into v2 (from Peter's branch) but couldn't get that to work somehow.

Thank you all!

LauraBreimann avatar Apr 12 '24 04:04 LauraBreimann

@LauraBreimann, @erikgerdtsson I made a fork of Cellpose 3.0.10 which allows you to use the GPU or CPU with the GUI on Apple Silicon.

From what I've tested, it works well, both from the GUI and from a Python script. And it was fairly easy to set up, so if anyone can bring it cleanly into the main branch that would be great. Normally, what I've done in my fork shouldn't affect other configurations (see changes).

However, I couldn't get the training to work, unfortunately. It does use the GPU, there are no errors but the generated model doesn't find any cells, and when using the GUI it shows:

[INFO] 0, train_loss=1.4174, test_loss=0.0000, LR=0.0000, time 0.60s
[INFO] 5, train_loss=nan, test_loss=0.0000, LR=0.0556, time 1.25s
[INFO] 10, train_loss=nan, test_loss=0.0000, LR=0.1000, time 1.76s

and so on. And using a Python script:

[INFO] 0, train_loss=nan, test_loss=0.0000, LR=0.0000, time 0.51s
[INFO] 5, train_loss=nan, test_loss=0.0000, LR=0.0556, time 1.11s
[INFO] 10, train_loss=nan, test_loss=0.0000, LR=0.1000, time 1.62s

Unfortunately, I can't figure out where the problem is coming from. If anyone has any ideas, that would be great.

OratHelm avatar Jul 19 '24 16:07 OratHelm
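To narrow down when the loss first goes bad, the training log lines quoted above can be scanned for the first NaN. The sketch below is a standalone helper, not part of Cellpose; the regular expression assumes the exact `[INFO] epoch, train_loss=..., test_loss=..., LR=...` layout shown in the comment, and the function names are hypothetical.

```python
import math
import re

# Matches lines like: [INFO] 5, train_loss=nan, test_loss=0.0000, LR=0.0556, time 1.25s
LOG_LINE = re.compile(
    r"\[INFO\]\s+(\d+),\s*train_loss=([^,\s]+),\s*test_loss=([^,\s]+),\s*LR=([^,\s]+)"
)


def parse_training_log(text):
    """Return a list of (epoch, train_loss, test_loss, lr) tuples found in text."""
    rows = []
    for match in LOG_LINE.finditer(text):
        epoch, train, test, lr = match.groups()
        rows.append((int(epoch), float(train), float(test), float(lr)))
    return rows


def first_nan_epoch(text):
    """Return the first logged epoch whose train_loss is NaN, or None."""
    for epoch, train_loss, _test_loss, _lr in parse_training_log(text):
        if math.isnan(train_loss):
            return epoch
    return None


gui_log = """
[INFO] 0, train_loss=1.4174, test_loss=0.0000, LR=0.0000, time 0.60s
[INFO] 5, train_loss=nan, test_loss=0.0000, LR=0.0556, time 1.25s
[INFO] 10, train_loss=nan, test_loss=0.0000, LR=0.1000, time 1.76s
"""

if __name__ == "__main__":
    print(first_nan_epoch(gui_log))  # prints 5
```

For the GUI log this reports epoch 5, the first logged epoch after the learning rate becomes non-zero; for the script log the loss is already NaN at epoch 0.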

Thank you, this looks great! Indeed, torch mps now supports double, so all the inference should work. Just curious: what is the speedup compared to the CPU?

Happy to merge this in the near future, even without training support, but it would be really nice to have that. It looks like one step in the network is perhaps not implemented for autograd with mps: https://github.com/pytorch/pytorch/issues?q=is%3Aissue+mps+autograd+nan. But we don't have any uncommon steps in the network, so I'm not sure which would fail.

carsen-stringer avatar Jul 19 '24 16:07 carsen-stringer

For the following tests, I used an M2 with 8 CPU cores and 8 GPU cores, which I think is the smallest GPU offered by Apple. On a few images, segmentation with cyto3 was 36% faster and denoising approximately 7 times faster on the GPU compared to the CPU. And I think the results would be even more interesting for training...

OratHelm avatar Jul 19 '24 21:07 OratHelm
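A comparison like the one above can be reproduced with a small timing harness. This is a generic sketch using only the standard library; `median_runtime` and `speedup` are hypothetical helper names, and the callables you pass in (for example, one that runs segmentation on the CPU and one on the GPU) are up to you. When timing GPU work, the callable should wait for the device to finish before returning, otherwise the measurement may be too optimistic.

```python
import statistics
import time


def median_runtime(fn, repeats=5):
    """Call fn() several times and return the median wall-clock time in seconds."""
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        times.append(time.perf_counter() - start)
    return statistics.median(times)


def speedup(baseline_seconds, candidate_seconds):
    """Return how many times faster the candidate is than the baseline."""
    return baseline_seconds / candidate_seconds


if __name__ == "__main__":
    # Stand-in workloads; replace with real CPU and GPU runs.
    slow = median_runtime(lambda: sum(range(200_000)))
    fast = median_runtime(lambda: sum(range(50_000)))
    print(f"speedup: {speedup(slow, fast):.2f}x")
```

Reporting the ratio of median times avoids the ambiguity of phrases like "36% faster", which can mean either a 36% shorter runtime or a 1.36x throughput gain.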

The training is working fine for me on an M3 with Python 3.11 and PyTorch 2.4.0!!

carsen-stringer avatar Aug 23 '24 17:08 carsen-stringer

@OratHelm could you please open a pull request with your fork?

carsen-stringer avatar Aug 23 '24 17:08 carsen-stringer

It works for me too, with Python 3.9.19 and PyTorch 2.4.0! 🥳 I just opened the pull request (#1003).

OratHelm avatar Aug 23 '24 20:08 OratHelm

Amazing, thanks!

carsen-stringer avatar Aug 23 '24 21:08 carsen-stringer