EmbodiedScan

[Bug] Error occurs while converting ScanNet

Open Mintinson opened this issue 1 year ago • 4 comments

Prerequisite

Task

I'm using the official example scripts/configs for the officially supported tasks/models/datasets.

Branch

main branch https://github.com/open-mmlab/mmdetection3d

Environment

System environment:
sys.platform: linux
Python: 3.8.19 (default, Mar 20 2024, 19:58:24) [GCC 11.2.0]
CUDA available: True
MUSA available: False
numpy_random_seed: 287746113
GPU 0: NVIDIA A100-PCIE-40GB
CUDA_HOME: /usr/local/cuda-11.3
NVCC: Cuda compilation tools, release 11.3, V11.3.58
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.11.0
PyTorch compiling details: PyTorch built with:

  • GCC 7.3

  • C++ Version: 201402

  • Intel(R) oneAPI Math Kernel Library Version 2023.1-Product Build 20230303 for Intel(R) 64 architecture applications

  • Intel(R) MKL-DNN v2.5.2 (Git Hash a9302535553c73243c632ad3c4c80beec3d19a1e)

  • OpenMP 201511 (a.k.a. OpenMP 4.5)

  • LAPACK is enabled (usually provided by MKL)

  • NNPACK is enabled

  • CPU capability usage: AVX2

  • CUDA Runtime 11.3

  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37

  • CuDNN 8.2

  • Magma 2.5.2

  • Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.3, CUDNN_VERSION=8.2.0, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.11.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=OFF, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF,

TorchVision: 0.12.0
OpenCV: 4.10.0
MMEngine: 0.10.5

Runtime environment:
cudnn_benchmark: False
mp_cfg: {'mp_start_method': 'fork', 'opencv_num_threads': 0}
dist_cfg: {'backend': 'nccl'}
seed: 287746113
Distributed launcher: none
Distributed training: False
GPU number: 1

Reproduces the problem - code sample

python embodiedscan/converter/generate_image_scannet.py --dataset_folder data/scannet/ --fast  

Reproduces the problem - command or script

python embodiedscan/converter/generate_image_scannet.py --dataset_folder data/scannet/ --fast  

Reproduces the problem - error message

Traceback (most recent call last):
  File "/root/miniconda3/envs/embodiedscan/lib/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "embodiedscan/converter/generate_image_scannet.py", line 176, in process_scene
    data = SensorData(os.path.join(path, idx, f'{idx}.sens'), fast)
  File "embodiedscan/converter/generate_image_scannet.py", line 62, in __init__
    self.load(filename, fast)
  File "embodiedscan/converter/generate_image_scannet.py", line 104, in load
    frame.load(f)
  File "embodiedscan/converter/generate_image_scannet.py", line 37, in load
    struct.unpack('c' * self.color_size_bytes,
struct.error: unpack requires a buffer of 265617 bytes
"""

Additional information

Training runs fine, but when I run test.py it fails with FileNotFoundError: [Errno 2] No such file or directory: 'data/scannet/posed_images/scene0568_00/00000.jpg'. I checked and that image does not exist at that path (apparently not every scene was converted during my first conversion), so I re-ran the generate_image_scannet.py script. This time it fails with the error shown above, even though it worked on my first run. To see which file was failing, I wrapped the frame.load(f) call in the source code in a try/except:

                try:
                    frame.load(f)
                except Exception as e:
                    # Show which .sens file was being read and why it failed.
                    print(f)
                    print(e)

which prints the following:

<_io.BufferedReader name='scans/scene0548_00/scene0548_00.sens'>
unpack requires a buffer of 64 bytes
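The message above is what `struct.unpack` raises when it is handed fewer bytes than its format string needs, which is consistent with a `.sens` file that ends early. The following is a minimal sketch that reproduces the message in isolation. The 64-byte field (16 `float32` values) and the `read_exact` helper are illustrative assumptions and are not part of the converter script.

```python
import io
import struct


def read_exact(stream, size):
    """Read exactly `size` bytes, or raise EOFError naming the shortfall.

    Hypothetical helper: it turns a silent short read into an explicit
    "file is truncated" error before struct.unpack is ever called.
    """
    data = stream.read(size)
    if len(data) != size:
        raise EOFError(f'expected {size} bytes, got {len(data)}')
    return data


# Pack 16 float32 values (64 bytes), then simulate a truncated file
# by keeping only the first 40 bytes.
full = struct.pack('f' * 16, *[float(i) for i in range(16)])
truncated = io.BytesIO(full[:40])

try:
    # read(64) returns only 40 bytes here, so unpack cannot succeed.
    struct.unpack('f' * 16, truncated.read(64))
except struct.error as err:
    unpack_message = str(err)

# The same short read, reported by the helper instead.
try:
    read_exact(io.BytesIO(full[:40]), 64)
except EOFError as err:
    helper_message = str(err)
```

A check like `read_exact` does not repair anything. It only makes it obvious that the input ended early, which points at the file rather than at the parsing code.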

Is the buffer too short? I would like to know how to solve this problem. Also, since I have already converted most of the ScanNet data, is there a way to convert only the remaining scenes to save time?
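One way to avoid reconverting everything is to filter the scene list before processing. The sketch below assumes the output layout seen in the error message (`posed_images/<scene>/00000.jpg`) and uses that first frame as a marker. `scenes_to_convert` is a hypothetical helper, not an option of `generate_image_scannet.py`, and the marker only shows that a scene was started, not that every frame was written.

```python
import os


def scenes_to_convert(scene_ids, output_root, marker='00000.jpg'):
    """Return the scene ids whose output folder lacks the marker file.

    Assumption: a converted scene lives at <output_root>/<scene_id>/ and
    contains `marker`. A partially converted scene that already has the
    marker would be skipped, so this is a coarse filter only.
    """
    return [
        idx for idx in scene_ids
        if not os.path.exists(os.path.join(output_root, idx, marker))
    ]
```

For example, `scenes_to_convert(os.listdir('data/scannet/scans'), 'data/scannet/posed_images')` would list the scenes that still need converting, assuming the raw scenes sit under `scans/` as the printed file name suggests.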

Thank you for your timely help.

Mintinson, Sep 28 '24 00:09

I have found the solution. The data for that scene appears to have been corrupted; I re-downloaded the scene and the conversion now works. However, test.py still fails, this time with the following error:

Traceback (most recent call last):
  File "tools/test.py", line 157, in <module>
    main()
  File "tools/test.py", line 153, in main
    runner.test()
  File "/root/miniconda3/envs/embodiedscan/lib/python3.8/site-packages/mmengine/runner/runner.py", line 1823, in test
    metrics = self.test_loop.run()  # type: ignore
  File "/root/miniconda3/envs/embodiedscan/lib/python3.8/site-packages/mmengine/runner/loops.py", line 463, in run
    self.run_iter(idx, data_batch)
  File "/root/miniconda3/envs/embodiedscan/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/root/miniconda3/envs/embodiedscan/lib/python3.8/site-packages/mmengine/runner/loops.py", line 487, in run_iter
    outputs = self.runner.model.test_step(data_batch)
  File "/root/miniconda3/envs/embodiedscan/lib/python3.8/site-packages/mmengine/model/base_model/base_model.py", line 145, in test_step
    return self._run_forward(data, mode='predict')  # type: ignore
  File "/root/miniconda3/envs/embodiedscan/lib/python3.8/site-packages/mmengine/model/base_model/base_model.py", line 361, in _run_forward
    results = self(**data, mode=mode)
  File "/root/miniconda3/envs/embodiedscan/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/EmbodiedScan/embodiedscan/models/detectors/sparse_featfusion_grounder.py", line 681, in forward
    return self.predict(inputs, data_samples, **kwargs)
  File "/root/EmbodiedScan/embodiedscan/models/detectors/sparse_featfusion_grounder.py", line 538, in predict
    positive_maps = self.get_positive_map(tokenized, tokens_positive)
  File "/root/EmbodiedScan/embodiedscan/models/detectors/sparse_featfusion_grounder.py", line 632, in get_positive_map
    positive_map = self.create_positive_map(tokenized, tp, idx)
  File "/root/EmbodiedScan/embodiedscan/models/detectors/sparse_featfusion_grounder.py", line 595, in create_positive_map
    for (beg, end) in tok_list:
TypeError: cannot unpack non-iterable int object

Mintinson, Sep 28 '24 06:09

In sparse_featfusion_grounder.py, change lines L532-L533 of the predict function to the following code:

  tokens_positive = [[[[0, 1]]]
                     for _ in range(len(batch_data_samples))]
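The nesting depth is what matters here. The sketch below shows why: the loop in the traceback unpacks `(beg, end)` from each element of `tok_list`, so each sample needs the shape phrase list -> span list -> `[beg, end]`. `collect_spans` is a simplified stand-in written for illustration, not the repository's `create_positive_map`, and the "shallow" value is an assumed shape with one level of nesting fewer, since the original lines are not quoted in this thread.

```python
def collect_spans(tokens_positive_for_sample):
    """Simplified stand-in for the loop shown in the traceback."""
    spans = []
    for tok_list in tokens_positive_for_sample:  # one entry per phrase
        for (beg, end) in tok_list:              # one [beg, end] pair per span
            spans.append((beg, end))
    return spans


batch_size = 2

# Shape from the suggested fix: each sample is [[[0, 1]]].
fixed = [[[[0, 1]]] for _ in range(batch_size)]

# Assumed failing shape: each sample is [[0, 1]], so tok_list is [0, 1]
# and the inner loop tries to unpack the integer 0.
shallow = [[[0, 1]] for _ in range(batch_size)]

fixed_spans = [collect_spans(tp) for tp in fixed]

try:
    collect_spans(shallow[0])
except TypeError as err:
    shallow_error = str(err)
```

With the deeper nesting every sample yields one `(0, 1)` span, while the shallow shape raises the same kind of `TypeError` as in the traceback.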

henryzhengr, Sep 28 '24 07:09

Thanks! It works!

Mintinson, Sep 28 '24 07:09

Thanks a lot for pointing out this bug. We'll fix it in the next update!

mxh1999, Sep 29 '24 08:09