Train BUG, please help me

Open SeanSiyang opened this issue 2 years ago • 3 comments

When I run the command python train.py --config conf/modelnet.yaml, I get the following error:


Traceback (most recent call last):
  File "train.py", line 85, in <module>
    main()
  File "train.py", line 81, in main
    trainer.fit(model, train_loader, val_loader)
  File "/home/zsy/Code/RegTR-main/src/trainer.py", line 79, in fit
    self._run_validation(model, val_loader, step=global_step,
  File "/home/zsy/Code/RegTR-main/src/trainer.py", line 249, in _run_validation
    val_out = model.validation_step(val_batch, val_batch_idx)
  File "/home/zsy/Code/RegTR-main/src/models/generic_reg_model.py", line 83, in validation_step
    pred = self.forward(batch)
  File "/home/zsy/Code/RegTR-main/src/models/regtr.py", line 117, in forward
    kpconv_meta = self.preprocessor(batch['src_xyz'] + batch['tgt_xyz'])
  File "/home/zsy/anaconda3/envs/REG/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/zsy/Code/RegTR-main/src/models/backbone_kpconv/kpconv.py", line 489, in forward
    pool_p, pool_b = batch_grid_subsampling_kpconv_gpu(
  File "/home/zsy/Code/RegTR-main/src/models/backbone_kpconv/kpconv.py", line 232, in batch_grid_subsampling_kpconv_gpu
    sparse_tensor = ME.SparseTensor(
  File "/home/zsy/anaconda3/envs/REG/lib/python3.8/site-packages/MinkowskiEngine/MinkowskiSparseTensor.py", line 275, in __init__
    coordinates, features, coordinate_map_key = self.initialize_coordinates(
  File "/home/zsy/anaconda3/envs/REG/lib/python3.8/site-packages/MinkowskiEngine/MinkowskiSparseTensor.py", line 338, in initialize_coordinates
    features = spmm_avg.apply(self.inverse_mapping, cols, size, features)
  File "/home/zsy/anaconda3/envs/REG/lib/python3.8/site-packages/MinkowskiEngine/sparse_matrix_functions.py", line 183, in forward
    result, COO, vals = spmm_average(
  File "/home/zsy/anaconda3/envs/REG/lib/python3.8/site-packages/MinkowskiEngine/sparse_matrix_functions.py", line 93, in spmm_average
    result, COO, vals = MEB.coo_spmm_average_int32(
RuntimeError: CUSPARSE_STATUS_INVALID_VALUE at /tmp/pip-req-build-h0w4jzhp/src/spmm.cu:591

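My environment is configured as required. I think the problem might be in the code below (the ME.SparseTensor call in batch_grid_subsampling_kpconv_gpu, kpconv.py):

    sparse_tensor = ME.SparseTensor(
        features=points,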
        coordinates=coord_batched,
        quantization_mode=ME.SparseTensorQuantizationMode.UNWEIGHTED_AVERAGE
    )

I can't solve it. Please help me, thanks.

SeanSiyang avatar Aug 01 '22 07:08 SeanSiyang

This is related to #1. Unfortunately, I'm not able to reproduce the problem on my machine. You might find it useful to install MinkowskiEngine using the commands listed in this post. If that doesn't work, you can modify the code to use the CPU preprocessing code instead.

yewzijian avatar Aug 01 '22 10:08 yewzijian

Thanks. If I manage to fix this, I'll share the solution here.

SeanSiyang avatar Aug 01 '22 10:08 SeanSiyang

I removed the final argument, quantization_mode=ME.SparseTensorQuantizationMode.UNWEIGHTED_AVERAGE, and it works.

xqZhang-Strong avatar Oct 20 '22 10:10 xqZhang-Strong
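
A note for anyone weighing the workaround: as far as I know, dropping quantization_mode makes ME.SparseTensor fall back to its default mode, which keeps a single point per voxel instead of averaging them, so the subsampled clouds may differ slightly from what the original code produces. For reference, here is a minimal pure-Python sketch of what per-voxel unweighted averaging computes. The function name voxel_average and its voxel_size parameter are hypothetical and are not part of RegTR or MinkowskiEngine. It only illustrates the operation and is not a drop-in replacement for the GPU code.

```python
import math
from collections import defaultdict


def voxel_average(points, voxel_size):
    """Average all points that fall into the same voxel (hypothetical helper)."""
    sums = defaultdict(lambda: [0.0, 0.0, 0.0])
    counts = defaultdict(int)
    for x, y, z in points:
        # Integer voxel index, analogous to the quantized coordinates.
        key = (math.floor(x / voxel_size),
               math.floor(y / voxel_size),
               math.floor(z / voxel_size))
        acc = sums[key]
        acc[0] += x
        acc[1] += y
        acc[2] += z
        counts[key] += 1
    # One averaged point per occupied voxel, ordered by voxel index.
    return [tuple(c / counts[k] for c in sums[k]) for k in sorted(sums)]


# Two points share voxel (0, 0, 0); the third sits alone in voxel (1, 0, 0).
cloud = [(0.1, 0.1, 0.1), (0.3, 0.3, 0.3), (1.2, 0.0, 0.0)]
subsampled = voxel_average(cloud, voxel_size=1.0)
```

With the sample cloud above, the first two points are merged into their mean and the third is kept unchanged, so two points remain after subsampling.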