pointnet2

No OpKernel issue on FarthestPointSample

Open Kaiwind88 opened this issue 6 years ago • 9 comments

Hi Charles, when I tried to run your train.py code, I got an error. It seems FarthestPointSample is running on the CPU rather than the GPU, but I did not change anything in your code. Do you have any suggestions? I looked at the CUDA code but could not find any issue (I am pretty new to CUDA). The error is listed below.

Thank you.

Traceback (most recent call last):
  File "train.py", line 284, in <module>
    train()
  File "train.py", line 160, in train
    sess.run(init)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 895, in run
    run_metadata_ptr)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1124, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1321, in _do_run
    options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1340, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: No OpKernel was registered to support Op 'FarthestPointSample' with these attrs. Registered devices: [CPU], Registered kernels: device='GPU'

 [[Node: layer1/FarthestPointSample = FarthestPointSample[npoint=512, _device="/device:GPU:0"](Placeholder)]]

Caused by op u'layer1/FarthestPointSample', defined at:
  File "train.py", line 284, in <module>
    train()
  File "train.py", line 121, in train
    pred, end_points = MODEL.get_model(pointclouds_pl, is_training_pl, bn_decay=bn_decay)
  File "/home/kai/Documents/PointCloudsProcessing/pointnet2/models/pointnet2_cls_ssg.py", line 32, in get_model
    l1_xyz, l1_points, l1_indices = pointnet_sa_module(l0_xyz, l0_points, npoint=512, radius=0.2, nsample=32, mlp=[64,64,128], mlp2=None, group_all=False, is_training=is_training, bn_decay=bn_decay, scope='layer1', use_nchw=True)
  File "/home/kai/Documents/PointCloudsProcessing/pointnet2/utils/pointnet_util.py", line 113, in pointnet_sa_module
    new_xyz, new_points, idx, grouped_xyz = sample_and_group(npoint, radius, nsample, xyz, points, knn, use_xyz)
  File "/home/kai/Documents/PointCloudsProcessing/pointnet2/utils/pointnet_util.py", line 40, in sample_and_group
    new_xyz = gather_point(xyz, farthest_point_sample(npoint, xyz)) # (batch_size, npoint, 3)
  File "/home/kai/Documents/PointCloudsProcessing/pointnet2/tf_ops/sampling/tf_sampling.py", line 56, in farthest_point_sample
    return sampling_module.farthest_point_sample(inp, npoint)
  File "", line 46, in farthest_point_sample
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
    op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2630, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1204, in __init__
    self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): No OpKernel was registered to support Op 'FarthestPointSample' with these attrs. Registered devices: [CPU], Registered kernels: device='GPU'

 [[Node: layer1/FarthestPointSample = FarthestPointSample[npoint=512, _device="/device:GPU:0"](Placeholder)]]

Kaiwind88, Apr 20 '18 17:04

I think the main reason for this error is that you are using a tensorflow-cpu installation. Reinstall using tensorflow-gpu instead.
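The error message itself supports this reading: it lists the devices TensorFlow can see and the devices the op has kernels for. A minimal sketch of pulling those two lists out of the message, assuming the message format shown in the traceback above (the helper names `parse_opkernel_error` and `missing_devices` are made up for illustration):

```python
import re

# Message text copied from the traceback in this thread.
MESSAGE = ("No OpKernel was registered to support Op 'FarthestPointSample' "
           "with these attrs. Registered devices: [CPU], "
           "Registered kernels: device='GPU'")


def parse_opkernel_error(message):
    """Return (devices TensorFlow can see, devices the op has kernels for)."""
    found = re.search(r"Registered devices: \[([^\]]*)\]", message)
    seen = [d.strip() for d in found.group(1).split(",")] if found else []
    kernels = re.findall(r"device='([^']+)'", message)
    return seen, kernels


def missing_devices(message):
    """Devices that have a kernel for the op but are not visible to TensorFlow."""
    seen, kernels = parse_opkernel_error(message)
    return [k for k in kernels if k not in seen]


if __name__ == "__main__":
    print(parse_opkernel_error(MESSAGE))
    print(missing_devices(MESSAGE))
```

Here the op only has a GPU kernel while the process only sees a CPU, which is what you would expect from a CPU-only build or from a GPU that TensorFlow cannot reach.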

KamalM8, May 10 '18 12:05

@Kaiwind88 Have you solved the problem? I have encountered the same problem.

kingsvalley, May 14 '18 14:05

I'm using a tensorflow-gpu installation but I'm still getting the same error.

merium, Jun 06 '18 13:06

This is actually a GPU problem. Run the TensorFlow GPU test and make sure it passes: https://www.tensorflow.org/programmers_guide/using_gpu

merium, Jun 07 '18 03:06

@kingsvalley Yes, I just found the solution, as mentioned by @merium and @KamalM8. I had tensorflow-cpu installed by mistake, so the code did not recognize the GPU. I installed the GPU version, and it works.
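To see which TensorFlow build is actually installed before reinstalling, something like the following may help. It is only a sketch: it assumes Python 3.8+ (for `importlib.metadata`) and the pip distribution names `tensorflow`, `tensorflow-gpu` and `tensorflow-cpu`; the function names are made up for illustration.

```python
from importlib import metadata

# Distribution names assumed for this sketch.
WANTED = {"tensorflow", "tensorflow-gpu", "tensorflow-cpu"}


def tensorflow_distributions(names):
    """Filter an iterable of distribution names down to TensorFlow builds."""
    normalized = {n.lower().replace("_", "-") for n in names if n}
    return sorted(normalized & WANTED)


def installed_tensorflow_distributions():
    """Look up TensorFlow builds installed in the current environment."""
    names = [dist.metadata["Name"] for dist in metadata.distributions()]
    return tensorflow_distributions(names)


if __name__ == "__main__":
    print(installed_tensorflow_distributions())
```

If more than one build shows up in the list, uninstalling all of them and installing only the GPU build is probably the cleaner route.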

shadowind, Jun 07 '18 05:06

conda install tensorflow-gpu

frostinassiky, Mar 04 '20 18:03

Check whether TensorFlow is able to access the GPU by running these commands in a Python shell:

    import tensorflow as tf
    tf.test.is_gpu_available()

You should get True.

Then check tf.test.is_built_with_cuda(). Here you should also get True.

If you get False, you will have to install tensorflow-gpu:

    pip install tensorflow-gpu==XXXX

XXXX is the version number, in case you need a specific version.
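The two checks can be wrapped in a small script. This is a sketch, not an official recipe: `diagnose` and `check_tensorflow` are hypothetical helper names, and the advice strings only restate what is said in this thread. `tf.test.is_gpu_available()` is the call used above; as far as I know newer releases deprecate it in favour of `tf.config.list_physical_devices`, so the sketch tries that first.

```python
def diagnose(built_with_cuda, gpu_available):
    """Map the two check results onto the advice given in this thread."""
    if not built_with_cuda:
        return "CPU-only TensorFlow build: install tensorflow-gpu"
    if not gpu_available:
        return "CUDA build, but no usable GPU: check the driver and CUDA setup"
    return "TensorFlow can use the GPU"


def check_tensorflow():
    """Run both checks; returns None if TensorFlow cannot be imported."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    built = tf.test.is_built_with_cuda()
    config = getattr(tf, "config", None)
    if config is not None and hasattr(config, "list_physical_devices"):
        # Newer API: list the GPUs TensorFlow can see.
        available = len(config.list_physical_devices("GPU")) > 0
    else:
        # Older API, as used in the comment above.
        available = tf.test.is_gpu_available()
    return diagnose(built, available)


if __name__ == "__main__":
    print(check_tensorflow() or "TensorFlow is not installed in this environment")
```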

DineshChandra94, Mar 11 '20 09:03

Hi, you should check the following:

  • The CUDA version that was used to compile tensorflow-gpu. This link tells you the CUDA version of the pre-built tensorflow-gpu packages you get through pip.
  • The CUDA version you actually have on your system, which is normally installed under /usr/local/.

If the two CUDA versions are not the same, you are out of luck: the mismatch causes the error you see. It means you installed the right TensorFlow with GPU support, but the actual driver is not the right one to run it.
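The second check can be scripted. The sketch below assumes the common Linux layout where each toolkit lives in a directory named `/usr/local/cuda-<version>`; the function names are made up, and the required version (`"9.0"` in the example) is only a placeholder that you have to look up for your own TensorFlow build.

```python
import glob
import os
import re


def installed_cuda_versions(root="/usr/local"):
    """List versions of directories named cuda-<version> under root."""
    versions = []
    for path in glob.glob(os.path.join(root, "cuda-*")):
        match = re.fullmatch(r"cuda-(\d+(?:\.\d+)*)", os.path.basename(path))
        if match and os.path.isdir(path):
            versions.append(match.group(1))
    # Sort numerically so that 10.1 comes after 9.0.
    return sorted(versions, key=lambda v: [int(p) for p in v.split(".")])


def matches_required(required, root="/usr/local"):
    """True if the CUDA version your TensorFlow build needs is installed."""
    return required in installed_cuda_versions(root)


if __name__ == "__main__":
    print("Installed CUDA versions:", installed_cuda_versions())
    # "9.0" is a placeholder, not a recommendation.
    print("Required version present:", matches_required("9.0"))
```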

Cheers.

towardthesea, Jul 13 '20 09:07

If you have a single GPU, just set os.environ['CUDA_VISIBLE_DEVICES'] to str(0) in the main.py file:

os.environ['CUDA_VISIBLE_DEVICES'] = str(0)

and make sure you have tensorflow-gpu installed.
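One detail worth spelling out: the variable is read when CUDA is initialized, so the conservative choice is to set it before TensorFlow is imported. A minimal sketch, where `select_gpu` is a hypothetical helper and not part of pointnet2:

```python
import os


def select_gpu(index=0):
    """Expose only the GPU with the given index to CUDA applications."""
    os.environ["CUDA_VISIBLE_DEVICES"] = str(index)
    return os.environ["CUDA_VISIBLE_DEVICES"]


if __name__ == "__main__":
    select_gpu(0)
    # Import tensorflow only after this point so that it picks up the setting.
    print("CUDA_VISIBLE_DEVICES =", os.environ["CUDA_VISIBLE_DEVICES"])
```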

caiobarrosv, Feb 03 '21 01:02