[BUG] - The test accuracy is consistently 0.0% when I run the Quickstart tutorial on the M2 ProMax.
Add Link
beginner_source/basics/quickstart_tutorial.py
Describe the bug
When executing the Quickstart tutorial on the M2 ProMax, the test accuracy stays at 0.0% in every epoch, even though the training loss decreases.
The code is exactly the same as the file linked below; nothing has been modified.
https://github.com/pytorch/tutorials/blob/main/beginner_source/basics/quickstart_tutorial.py
Error message:
Shape of X [N, C, H, W]: torch.Size([64, 1, 28, 28])
Shape of y: torch.Size([64]) torch.int64
Using mps device
NeuralNetwork(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (linear_relu_stack): Sequential(
    (0): Linear(in_features=784, out_features=512, bias=True)
    (1): ReLU()
    (2): Linear(in_features=512, out_features=512, bias=True)
    (3): ReLU()
    (4): Linear(in_features=512, out_features=10, bias=True)
  )
)
Epoch 1
-------------------------------
loss: 2.311726 [ 64/60000]
loss: 2.294588 [ 6464/60000]
loss: 2.276271 [12864/60000]
loss: 2.260946 [19264/60000]
loss: 2.237525 [25664/60000]
loss: 2.206861 [32064/60000]
loss: 2.221519 [38464/60000]
loss: 2.183903 [44864/60000]
loss: 2.185328 [51264/60000]
loss: 2.132372 [57664/60000]
Test Error:
Accuracy: 0.0%, Avg loss: 2.145295
Epoch 2
-------------------------------
loss: 2.162894 [ 64/60000]
loss: 2.153250 [ 6464/60000]
loss: 2.093136 [12864/60000]
loss: 2.102805 [19264/60000]
loss: 2.044294 [25664/60000]
loss: 1.984252 [32064/60000]
loss: 2.014595 [38464/60000]
loss: 1.927812 [44864/60000]
loss: 1.940063 [51264/60000]
loss: 1.844966 [57664/60000]
Test Error:
Accuracy: 0.0%, Avg loss: 1.863020
Epoch 3
-------------------------------
loss: 1.903860 [ 64/60000]
loss: 1.871587 [ 6464/60000]
loss: 1.749879 [12864/60000]
loss: 1.784963 [19264/60000]
loss: 1.671392 [25664/60000]
loss: 1.630241 [32064/60000]
loss: 1.649708 [38464/60000]
loss: 1.547975 [44864/60000]
loss: 1.578283 [51264/60000]
loss: 1.459987 [57664/60000]
Test Error:
Accuracy: 0.0%, Avg loss: 1.493550
Epoch 4
-------------------------------
loss: 1.567270 [ 64/60000]
loss: 1.531002 [ 6464/60000]
loss: 1.378248 [12864/60000]
loss: 1.449183 [19264/60000]
loss: 1.325287 [25664/60000]
loss: 1.331857 [32064/60000]
loss: 1.349332 [38464/60000]
loss: 1.269045 [44864/60000]
loss: 1.308438 [51264/60000]
loss: 1.205329 [57664/60000]
Test Error:
Accuracy: 0.0%, Avg loss: 1.238385
Epoch 5
-------------------------------
loss: 1.321056 [ 64/60000]
loss: 1.299582 [ 6464/60000]
loss: 1.131391 [12864/60000]
loss: 1.237661 [19264/60000]
loss: 1.105717 [25664/60000]
loss: 1.140301 [32064/60000]
loss: 1.168805 [38464/60000]
loss: 1.099758 [44864/60000]
loss: 1.142098 [51264/60000]
loss: 1.057198 [57664/60000]
Test Error:
Accuracy: 0.0%, Avg loss: 1.082316
Done!
Saved PyTorch Model State to model.pth
Traceback (most recent call last):
  File "/*****/quick_start.py", line 430, in <module>
    run_github()
  File "/*****//quick_start.py", line 420, in run_github
    predicted, actual = classes[pred[0].argmax(0)], classes[y]
IndexError: list index out of range
Process finished with exit code 1
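The traceback points at `classes[pred[0].argmax(0)]`, which suggests that the index returned by `argmax` fell outside the ten class labels. That would also be consistent with the 0.0% accuracy, though the log alone does not confirm it. A minimal sketch for checking this, assuming the predicted indices have first been moved to the CPU and converted to a plain list (the helper name `out_of_range_indices` is hypothetical and not part of the tutorial):

```python
from typing import Iterable, List, Tuple


def out_of_range_indices(indices: Iterable[int], num_classes: int) -> List[Tuple[int, int]]:
    """Return (position, value) pairs for indices outside [0, num_classes)."""
    return [
        (pos, int(idx))
        for pos, idx in enumerate(indices)
        if not 0 <= int(idx) < num_classes
    ]


if __name__ == "__main__":
    # Made-up values for illustration only; they are not taken from the report.
    sample = [3, 9, 10, -1, 0]
    print(out_of_range_indices(sample, num_classes=10))
```

In the tutorial's test loop this could be called as `out_of_range_indices(pred.argmax(1).cpu().tolist(), 10)`. A non-empty result on `mps` alongside an empty one on `cpu` would point at the backend rather than the model.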
Describe your environment
- macOS
- torch version: 2.0.1
- No CUDA; torch.device = 'mps'
cc @suraj813
When I forcefully changed the device to CPU, everything worked fine.
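The workaround described here, forcing the device to CPU, can be made switchable so the same script runs on either backend. A minimal sketch, assuming the availability flags come from `torch.cuda.is_available()` and `torch.backends.mps.is_available()`, and that the preference order cuda, then mps, then cpu mirrors the tutorial; the function name `choose_device` and the `force_cpu` flag are hypothetical:

```python
def choose_device(cuda_available: bool, mps_available: bool, force_cpu: bool = False) -> str:
    """Pick a device string, preferring cuda, then mps, then cpu."""
    if force_cpu:
        # Hypothetical override that reproduces the CPU workaround.
        return "cpu"
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


if __name__ == "__main__":
    # Flags as they would look on an Apple Silicon machine without CUDA.
    print(choose_device(cuda_available=False, mps_available=True))
    print(choose_device(cuda_available=False, mps_available=True, force_cpu=True))
```

With PyTorch installed, the returned string can be passed to `model.to(...)` and to the `.to(device)` calls on each batch, so switching between `mps` and `cpu` needs only one changed argument.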
Interesting. @AlphaGJW, can you try PyTorch 2.1? To be frank, though, I cannot reproduce the failure with either PyTorch 2.0.1 or PyTorch 2.1.0 on my Mac M2 Pro (not the ProMax) running Sonoma. Perhaps your exact macOS version would be useful information to include.
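Since the failure seems to depend on the machine and OS, the version details asked about above can be gathered with a short script. This is only a sketch: the function name `collect_env_report` and the choice of fields are assumptions. It reads the installed torch version from package metadata, so it also runs where torch is missing.

```python
import importlib.metadata
import platform
from typing import Dict


def collect_env_report() -> Dict[str, str]:
    """Gather OS, Python, and torch version details without importing torch."""
    try:
        torch_version = importlib.metadata.version("torch")
    except importlib.metadata.PackageNotFoundError:
        torch_version = "not installed"
    return {
        "system": platform.system(),
        # mac_ver() returns an empty release string on non-macOS systems.
        "macos_release": platform.mac_ver()[0] or "n/a",
        "machine": platform.machine(),
        "python": platform.python_version(),
        "torch": torch_version,
    }


if __name__ == "__main__":
    for key, value in collect_env_report().items():
        print(f"{key}: {value}")
```

PyTorch also ships `python -m torch.utils.collect_env`, which prints a fuller report and may be the better thing to paste into the issue.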