mmdetection
Please make the package compatible with PyTorch Nightly on M1 Macs
Allow training on the MPS device in addition to CUDA and CPU.
Motivation Since the MPS backend was introduced in PyTorch Nightly, it has been possible to tell PyTorch to use the embedded GPU of the new M1 chips on Mac. This makes both training and inference much faster (3x-5x faster, according to my tests).
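For context, here is a minimal sketch of how a script might choose between CUDA, MPS and CPU. The helper names `pick_device` and `detect_device` are hypothetical and are not part of mmdetection, mmengine or mmcv; the only PyTorch calls assumed are `torch.cuda.is_available()` and `torch.backends.mps.is_available()`.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Return a device string, preferring CUDA, then MPS, then CPU.

    Hypothetical helper for illustration only.
    """
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Query PyTorch (if it is installed) and pick a device."""
    try:
        import torch
    except ImportError:
        # Without PyTorch there is nothing to accelerate.
        return "cpu"
    # Older PyTorch builds have no torch.backends.mps attribute at all.
    mps_backend = getattr(torch.backends, "mps", None)
    mps_ok = mps_backend is not None and mps_backend.is_available()
    return pick_device(torch.cuda.is_available(), mps_ok)


if __name__ == "__main__":
    print(detect_device())
```

On an M1 Mac with a build that includes the MPS backend, this would be expected to select `mps`. The returned string can then be passed to `torch.device(...)` or to `.to(...)` on a model or tensor.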
Related resources https://pytorch.org/blog/introducing-accelerated-pytorch-training-on-mac/ https://pytorch.org/docs/master/notes/mps.html
Additional context I tried to start training with mmdetection in a conda environment with PyTorch Nightly installed on an M1 Max Mac, and I got the following error:
Traceback (most recent call last):
  File "tools/train.py", line 133, in <module>
    main()
  File "tools/train.py", line 129, in main
    runner.train()
  File "/Users/xxxxxx/anaconda3/envs/openmmlabNIGHTLY/lib/python3.8/site-packages/mmengine/runner/runner.py", line 1706, in train
    model = self.train_loop.run() # type: ignore
  File "/Users/xxxxxx/anaconda3/envs/openmmlabNIGHTLY/lib/python3.8/site-packages/mmengine/runner/loops.py", line 96, in run
    self.run_epoch()
  File "/Users/xxxxxx/anaconda3/envs/openmmlabNIGHTLY/lib/python3.8/site-packages/mmengine/runner/loops.py", line 112, in run_epoch
    self.run_iter(idx, data_batch)
  File "/Users/xxxxxx/anaconda3/envs/openmmlabNIGHTLY/lib/python3.8/site-packages/mmengine/runner/loops.py", line 128, in run_iter
    outputs = self.runner.model.train_step(
  File "/Users/xxxxxx/anaconda3/envs/openmmlabNIGHTLY/lib/python3.8/site-packages/mmengine/model/base_model/base_model.py", line 114, in train_step
    losses = self._run_forward(data, mode='loss') # type: ignore
  File "/Users/xxxxxx/anaconda3/envs/openmmlabNIGHTLY/lib/python3.8/site-packages/mmengine/model/base_model/base_model.py", line 326, in _run_forward
    results = self(**data, mode=mode)
  File "/Users/xxxxxx/anaconda3/envs/openmmlabNIGHTLY/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/Users/xxxxxx/stuff/mmdetection/MacOS/mmdetection/mmdet/models/detectors/base.py", line 92, in forward
    return self.loss(inputs, data_samples)
  File "/Users/xxxxxx/stuff/mmdetection/MacOS/mmdetection/mmdet/models/detectors/two_stage.py", line 174, in loss
    rpn_losses, rpn_results_list = self.rpn_head.loss_and_predict(
  File "/Users/xxxxxx/stuff/mmdetection/MacOS/mmdetection/mmdet/models/dense_heads/base_dense_head.py", line 167, in loss_and_predict
    predictions = self.predict_by_feat(
  File "/Users/xxxxxx/stuff/mmdetection/MacOS/mmdetection/mmdet/models/dense_heads/base_dense_head.py", line 279, in predict_by_feat
    results = self._predict_by_feat_single(
  File "/Users/xxxxxx/stuff/mmdetection/MacOS/mmdetection/mmdet/models/dense_heads/rpn_head.py", line 233, in _predict_by_feat_single
    return self._bbox_post_process(
  File "/Users/xxxxxx/stuff/mmdetection/MacOS/mmdetection/mmdet/models/dense_heads/rpn_head.py", line 284, in _bbox_post_process
    det_bboxes, keep_idxs = batched_nms(bboxes, results.scores,
  File "/Users/xxxxxx/anaconda3/envs/openmmlabNIGHTLY/lib/python3.8/site-packages/mmcv/ops/nms.py", line 302, in batched_nms
    dets, keep = nms_op(boxes_for_nms, scores, **nms_cfg_)
  File "/Users/xxxxxx/anaconda3/envs/openmmlabNIGHTLY/lib/python3.8/site-packages/mmengine/utils/misc.py", line 354, in new_func
    output = old_func(*args, **kwargs)
  File "/Users/xxxxxx/anaconda3/envs/openmmlabNIGHTLY/lib/python3.8/site-packages/mmcv/ops/nms.py", line 127, in nms
    inds = NMSop.apply(boxes, scores, iou_threshold, offset, score_threshold,
  File "/Users/xxxxxx/anaconda3/envs/openmmlabNIGHTLY/lib/python3.8/site-packages/torch/autograd/function.py", line 506, in apply
    return super().apply(*args, **kwargs) # type: ignore[misc]
  File "/Users/xxxxxx/anaconda3/envs/openmmlabNIGHTLY/lib/python3.8/site-packages/mmcv/ops/nms.py", line 27, in forward
    inds = ext_module.nms(
RuntimeError: nms_impl: implementation for device mps:0 not found.
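The error is raised by mmcv's compiled NMS op (ext_module.nms in mmcv/ops/nms.py), which has no implementation for the mps:0 device. As a result, mmdetection currently cannot train with PyTorch Nightly using mps as an alternative device to CUDA and CPU.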
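Until the op supports MPS, one possible stopgap might be to run only the NMS step on the CPU and move the results back afterwards. The sketch below is an illustration under assumptions, not how mmcv or mmdetection handles this: `nms_with_cpu_fallback` is a hypothetical name, `nms_fn` stands for any NMS callable that returns a `(dets, keep)` pair as in the `dets, keep = nms_op(...)` line of the traceback, and the inputs are assumed to behave like PyTorch tensors (`.device`, `.cpu()`, `.to()`).

```python
def nms_with_cpu_fallback(nms_fn, boxes, scores, *args, **kwargs):
    """Call nms_fn, moving tensors to the CPU first when they live on MPS.

    Hypothetical wrapper for illustration. It assumes nms_fn returns a
    (dets, keep) pair and that a CPU implementation of the op exists.
    """
    device = boxes.device
    if device.type != "mps":
        # CUDA and CPU already have implementations; call straight through.
        return nms_fn(boxes, scores, *args, **kwargs)
    # Run the unsupported op on the CPU, then restore the original device
    # so that the rest of the model keeps running on the GPU.
    dets, keep = nms_fn(boxes.cpu(), scores.cpu(), *args, **kwargs)
    return dets.to(device), keep.to(device)
```

The extra device transfers cost time on every call, so this would only be a temporary measure. Whether the remaining ops in a given model run on MPS is a separate question that this sketch does not address.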