Segmentation fault when operating on data located on different GPUs

Open rubbberrabbit opened this issue 3 years ago • 0 comments

Description

Adding two ndarrays that live on different GPUs causes a segmentation fault. Most operations on data located on different devices raise a clear exception telling the user to copy the data to the same device first. However, simply adding two ndarrays on different GPUs crashes with a segmentation fault instead, which suggests the add operator neither handles this case nor reports a reasonable exception.

Error Message

Segmentation fault

To Reproduce

I ran this script on MXNet 1.9.1 with two RTX 3090 GPUs:

    from mxnet import np, npx
    import mxnet.gluon.nn as nn

    npx.set_np()

    X = np.ones((1, 10), ctx=npx.gpu(0))
    Y = np.ones((1, 10), ctx=npx.gpu(1))
    C = X + Y
    print(C)
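For comparison, here is a minimal sketch of a possible workaround: copy one operand to the other's device before adding. It has not been verified on the setup in this report. The helper name `add_on_same_device` and the `demo` function are hypothetical, introduced only for illustration; the helper relies on the `ctx` attribute and the `as_in_ctx` method of MXNet arrays.

```python
def add_on_same_device(x, y):
    """Add two arrays, copying y to x's device first if the devices differ.

    Hypothetical helper: it only assumes that both arrays expose a `ctx`
    attribute and an `as_in_ctx` method, as MXNet arrays do.
    """
    if y.ctx != x.ctx:
        # as_in_ctx returns a copy of y that lives on the target device.
        y = y.as_in_ctx(x.ctx)
    return x + y


def demo():
    # Assumes MXNet is installed and at least two GPUs are visible.
    from mxnet import np, npx

    npx.set_np()
    X = np.ones((1, 10), ctx=npx.gpu(0))
    Y = np.ones((1, 10), ctx=npx.gpu(1))
    print(add_on_same_device(X, Y))
```

Calling `demo()` on a multi-GPU machine runs the same shapes as the script above, but with both operands on `gpu(0)` before the addition. This only sidesteps the crash; it does not address the missing exception in the add operator.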

Steps to reproduce

Run the script above.

What have you tried to solve it?

Environment

MXNet 1.9.1, CUDA 11.2, two RTX 3090 GPUs.

rubbberrabbit · Aug 11 '22 08:08