Training on Mac MPS instead of CUDA
I'm using Apple's Metal Performance Shaders (MPS) as the GPU backend, but since I still get some warnings, I would like to confirm whether not using PyTorch automatic mixed precision has significant implications for model training. Are there any benchmark training statistics available?
With the default configuration, I get the following results for my first batches:
INFO: Starting training:
Epochs: 5
Batch size: 1
Learning rate: 1e-05
Training size: 4580
Validation size: 508
Checkpoints: True
Device: mps
Images scaling: 0.5
Mixed Precision: False
Epoch 1/5: 0%| | 0/4580 [00:00<?, ?img/s]/Users/calkoen/miniconda3/envs/torch/lib/python3.10/site-packages/torch/amp/autocast_mode.py:198: UserWarning: User provided device_type of 'cuda', but CUDA is not available. Disabling
warnings.warn('User provided device_type of \'cuda\', but CUDA is not available. Disabling')
Epoch 1/5: 9%| | 432/4580 [06:56<1:06:37, 1.04img/s, loss (batch)
Epoch 1/5: 20%|▏| 916/4580 [16:25<59:22, 1.03img/s, loss (batch)=1
Epoch 1/5: 10%| | 460/4580 [09:06<25:52:14, 22.61s/img, loss (batch
Epoch 1/5: 22%|▏| 1002/4580 [19:51<1:10:56, 1.19s/img, loss (batch
Epoch 1/5: 20%|▏| 918/4580 [18:10<22:55:57, 22.54s/img, loss (batch
INFO: Saved interrupt
Traceback (most recent call last):
File "/Users/calkoen/dev/Pytorch-UNet/train.py", line 265, in <module>
train_net(
File "/Users/calkoen/dev/Pytorch-UNet/train.py", line 124, in train_net
grad_scaler.scale(loss).backward()
File "/Users/calkoen/miniconda3/envs/torch/lib/python3.10/site-packages/torch/_tensor.py", line 396, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
File "/Users/calkoen/miniconda3/envs/torch/lib/python3.10/site-packages/torch/autograd/__init__.py", line 173, in backward
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
KeyboardInterrupt
During this run, GPU utilization and memory allocation were around 70-100% and 50-80%, respectively.
Some additional info below.
I'm setting the device with:
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")
print(device) # device(type='mps')
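For reference, a fuller selection helper could prefer CUDA, then MPS, then CPU (a minimal sketch; the name pick_device and the fallback ordering are my assumptions, not code from this repo):

import torch

def pick_device() -> torch.device:
    # Prefer CUDA, then Apple's MPS, then plain CPU.
    if torch.cuda.is_available():
        return torch.device("cuda")
    if torch.backends.mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")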
I don't think mixed-precision optimizations (AMP) exist for MPS, so I train with amp=False.
However, I still get this CUDA-related warning:
/Users/calkoen/miniconda3/envs/torch/lib/python3.10/site-packages/torch/amp/autocast_mode.py:198: UserWarning: User provided device_type of 'cuda', but CUDA is not available. Disabling
warnings.warn('User provided device_type of \'cuda\', but CUDA is not available. Disabling')
The warning comes from this context:
with torch.cuda.amp.autocast(enabled=amp):
masks_pred = net(images)
loss = criterion(masks_pred, true_masks) + dice_loss(
F.softmax(masks_pred, dim=1).float(),
F.one_hot(true_masks, net.n_classes)
.permute(0, 3, 1, 2)
.float(),
multiclass=True,
)
# just to be sure...
print(amp) # False
# the warning can be reproduced by running:
torch.cuda.amp.autocast() # or torch.cuda.amp.autocast(enabled=False)
This actually makes sense, as this autocast subclass has the device hard-coded to "cuda":
class autocast(torch.amp.autocast_mode.autocast):
def __init__(self, enabled : bool = True, dtype : torch.dtype = torch.float16, cache_enabled : bool = True):
if torch._jit_internal.is_scripting():
self._enabled = enabled
self.device = "cuda"
self.fast_dtype = dtype
return
super().__init__("cuda", enabled=enabled, dtype=dtype, cache_enabled=cache_enabled)
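A device-aware guard along these lines should avoid the warning entirely, since it never constructs a CUDA autocast context on MPS (a minimal sketch; amp_context is a hypothetical helper of mine built on the generic torch.autocast and contextlib.nullcontext, not code from this repo):

import contextlib
import torch

def amp_context(device: torch.device, enabled: bool):
    # autocast only supports the "cuda" and "cpu" device types, so on
    # MPS return a no-op context instead of torch.cuda.amp.autocast,
    # which hard-codes device_type="cuda" and triggers the warning.
    if device.type in ("cuda", "cpu"):
        return torch.autocast(device_type=device.type, enabled=enabled)
    return contextlib.nullcontext()

# usage in the training loop:
# with amp_context(device, amp):
#     masks_pred = net(images)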
Hi, can you try the latest master? I've added a check for the MPS device in the autocast. But since autocast only supports CPU and CUDA, you should still turn AMP off.
@milesial, great, thanks. I'm currently out of office but will check it ASAP.