Please, make the package PyTorch Nightly compatible for M1 Macs

Open fablau opened this issue 1 year ago • 11 comments

Allow training on the MPS device in addition to CUDA and CPU.

Motivation Since PyTorch Nightly added MPS support, it is now possible to tell PyTorch to use the embedded GPU of the new M1 chips on Mac. This makes both training and inference much faster (3x-5x faster, according to my tests).
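For reference, here is a minimal sketch of how a script might choose between CUDA, MPS, and CPU. The helper names `pick_device` and `detect_device` are hypothetical and not part of mmdetection or mmengine. The only PyTorch calls used are `torch.cuda.is_available()` and `torch.backends.mps.is_available()`, and the sketch assumes that falling back to CPU is acceptable when PyTorch is missing or too old to have an MPS backend.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Hypothetical helper: prefer CUDA, then MPS, then CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Hypothetical helper: query PyTorch for the available backends."""
    try:
        import torch
    except ImportError:
        # Assumption: without PyTorch installed, report CPU.
        return "cpu"
    # Older PyTorch builds have no torch.backends.mps module at all.
    mps_backend = getattr(torch.backends, "mps", None)
    mps_ok = mps_backend is not None and mps_backend.is_available()
    return pick_device(torch.cuda.is_available(), mps_ok)


if __name__ == "__main__":
    print(detect_device())
```

On an M1 Mac with a PyTorch build that includes MPS, this should select `mps`. Selecting the device is not enough on its own, though, as the traceback below shows.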

Related resources https://pytorch.org/blog/introducing-accelerated-pytorch-training-on-mac/ https://pytorch.org/docs/master/notes/mps.html

Additional context I tried to start training with mmdetection in a conda environment with PyTorch Nightly installed on an M1 Max Mac, and I got the following error:

Traceback (most recent call last):
  File "tools/train.py", line 133, in <module>
    main()
  File "tools/train.py", line 129, in main
    runner.train()
  File "/Users/xxxxxx/anaconda3/envs/openmmlabNIGHTLY/lib/python3.8/site-packages/mmengine/runner/runner.py", line 1706, in train
    model = self.train_loop.run()  # type: ignore
  File "/Users/xxxxxx/anaconda3/envs/openmmlabNIGHTLY/lib/python3.8/site-packages/mmengine/runner/loops.py", line 96, in run
    self.run_epoch()
  File "/Users/xxxxxx/anaconda3/envs/openmmlabNIGHTLY/lib/python3.8/site-packages/mmengine/runner/loops.py", line 112, in run_epoch
    self.run_iter(idx, data_batch)
  File "/Users/xxxxxx/anaconda3/envs/openmmlabNIGHTLY/lib/python3.8/site-packages/mmengine/runner/loops.py", line 128, in run_iter
    outputs = self.runner.model.train_step(
  File "/Users/xxxxxx/anaconda3/envs/openmmlabNIGHTLY/lib/python3.8/site-packages/mmengine/model/base_model/base_model.py", line 114, in train_step
    losses = self._run_forward(data, mode='loss')  # type: ignore
  File "/Users/xxxxxx/anaconda3/envs/openmmlabNIGHTLY/lib/python3.8/site-packages/mmengine/model/base_model/base_model.py", line 326, in _run_forward
    results = self(**data, mode=mode)
  File "/Users/xxxxxx/anaconda3/envs/openmmlabNIGHTLY/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/Users/xxxxxx/stuff/mmdetection/MacOS/mmdetection/mmdet/models/detectors/base.py", line 92, in forward
    return self.loss(inputs, data_samples)
  File "/Users/xxxxxx/stuff/mmdetection/MacOS/mmdetection/mmdet/models/detectors/two_stage.py", line 174, in loss
    rpn_losses, rpn_results_list = self.rpn_head.loss_and_predict(
  File "/Users/xxxxxx/stuff/mmdetection/MacOS/mmdetection/mmdet/models/dense_heads/base_dense_head.py", line 167, in loss_and_predict
    predictions = self.predict_by_feat(
  File "/Users/xxxxxx/stuff/mmdetection/MacOS/mmdetection/mmdet/models/dense_heads/base_dense_head.py", line 279, in predict_by_feat
    results = self._predict_by_feat_single(
  File "/Users/xxxxxx/stuff/mmdetection/MacOS/mmdetection/mmdet/models/dense_heads/rpn_head.py", line 233, in _predict_by_feat_single
    return self._bbox_post_process(
  File "/Users/xxxxxx/stuff/mmdetection/MacOS/mmdetection/mmdet/models/dense_heads/rpn_head.py", line 284, in _bbox_post_process
    det_bboxes, keep_idxs = batched_nms(bboxes, results.scores,
  File "/Users/xxxxxx/anaconda3/envs/openmmlabNIGHTLY/lib/python3.8/site-packages/mmcv/ops/nms.py", line 302, in batched_nms
    dets, keep = nms_op(boxes_for_nms, scores, **nms_cfg_)
  File "/Users/xxxxxx/anaconda3/envs/openmmlabNIGHTLY/lib/python3.8/site-packages/mmengine/utils/misc.py", line 354, in new_func
    output = old_func(*args, **kwargs)
  File "/Users/xxxxxx/anaconda3/envs/openmmlabNIGHTLY/lib/python3.8/site-packages/mmcv/ops/nms.py", line 127, in nms
    inds = NMSop.apply(boxes, scores, iou_threshold, offset, score_threshold,
  File "/Users/xxxxxx/anaconda3/envs/openmmlabNIGHTLY/lib/python3.8/site-packages/torch/autograd/function.py", line 506, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
  File "/Users/xxxxxx/anaconda3/envs/openmmlabNIGHTLY/lib/python3.8/site-packages/mmcv/ops/nms.py", line 27, in forward
    inds = ext_module.nms(
RuntimeError: nms_impl: implementation for device mps:0 not found.

The traceback shows that the failure happens in mmcv's compiled NMS op (`ext_module.nms`), which has no implementation for the mps device. As it stands, mmdetection cannot train on MPS as an alternative to CUDA and CPU.
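One possible stopgap, sketched here under assumptions and not tested against mmcv itself, is to run the NMS call on CPU copies of the tensors and move the results back to the original device. The wrapper name `call_nms_on_cpu` is hypothetical. It assumes the wrapped function takes `boxes` and `scores` as its first two arguments and returns either one tensor or a tuple of tensors, and it relies only on the standard tensor methods `.cpu()` and `.to(device)`.

```python
def call_nms_on_cpu(nms_fn, boxes, scores, *args, **kwargs):
    """Hypothetical wrapper: run nms_fn on CPU when the inputs live on MPS.

    nms_fn is any callable taking (boxes, scores, ...) and returning a
    tensor or a tuple of tensors. Inputs on other devices pass through.
    """
    device = boxes.device
    if getattr(device, "type", None) != "mps":
        # CUDA and CPU already have an implementation; call directly.
        return nms_fn(boxes, scores, *args, **kwargs)

    # Copy the inputs to CPU, where the op is implemented.
    outputs = nms_fn(boxes.cpu(), scores.cpu(), *args, **kwargs)

    # Move every result back to the device the caller was using.
    if isinstance(outputs, tuple):
        return tuple(out.to(device) for out in outputs)
    return outputs.to(device)
```

The extra device transfers on every call would cost some of the speedup from MPS, so this is a workaround and not a substitute for a native mps implementation of the op.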

fablau, Apr 12 '23 23:04