autoware.universe
Can't launch lidar_centerpoint with error: could not load library: libcenterpoint_cuda_libraries.so.
Checklist
- [X] I've read the contribution guidelines.
- [X] I've searched other issues and no duplicate issues were found.
- [X] I'm convinced that this is not my fault but a bug.
Description
I use a custom Docker image and want to try the ROS 2 lidar_centerpoint package.
The build has succeeded so far, but running ros2 launch produces the following error.
Expected behavior
I expect to be able to use point cloud data for object recognition.
Actual behavior
I confirmed that the container started by the docker command can see the GPU. The command is as follows:
docker run --user $(id -u):$(id -g) --rm -it --gpus all -e DISPLAY -e TERM -e QT_X11_NO_MITSHM=1 -e XAUTHORITY=/tmp/.dockerjwzszsxi.xauth -v /tmp/.dockerjwzszsxi.xauth:/tmp/.dockerjwzszsxi.xauth -v /tmp/.X11-unix:/tmp/.X11-unix -v /etc/localtime:/etc/localtime:ro autoware-universe-ros2-export:v1 /bin/bash
Steps to reproduce
Nvidia-smi
Versions
Docker Link: https://drive.google.com/file/d/1sqpyMhZeLFWqCIbepB8lhlv8140bbxac/view?usp=sharing
OS: Ubuntu 20.04
ROS 2: Galactic
Possible causes
No response
Additional context
No response
I'll check with your docker image.
@mikechan0731 I successfully launched lidar_centerpoint in my environment.
docker run --user $(id -u):$(id -g) --rm -it --gpus all -e DISPLAY -e TERM -e QT_X11_NO_MITSHM=1 -v /tmp/.X11-unix:/tmp/.X11-unix -v /etc/localtime:/etc/localtime:ro -v $PWD:/home/itri/autoware_workspace 81e1c4fb8f3e /bin/bash
Could you describe your CUDA environment in more detail?
I also got the warnings below, but there were no warnings when I built the package using the Autoware Docker image.
Hi! @yukke42
Thank you so much for testing. It seems there is no problem with this Docker image; the problem is my GPU, so I will try it on a different computer.
My GPU is an NVIDIA RTX 2080.
Local environment: driver = 440, CUDA = 10.2 (as shown)
The environment in Docker uses the same driver with CUDA 11.4. I don't know whether this is a problem.
Thanks again for your assistance!!
@mikechan0731
Local environment driver = 440 , cuda = 10.2 (as shown)
This error might be caused by a version mismatch: the local driver may not support CUDA 11.1.
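One way to check this hypothesis is to compare the installed driver version against the minimum that NVIDIA lists for the CUDA toolkit used in the container. The sketch below only performs the version comparison. driver_at_least is a hypothetical helper name, and the required minimum is not assumed here; it has to be looked up in NVIDIA's CUDA release notes for the toolkit in question.

```shell
# Hypothetical helper: succeeds if the installed driver version is greater
# than or equal to the required one. Relies on version sort (sort -V).
# Usage: driver_at_least <installed_version> <required_version>
driver_at_least() {
  local installed="$1" required="$2"
  local lowest
  # After a version sort, the first line is the lower of the two versions.
  lowest="$(printf '%s\n%s\n' "$required" "$installed" | sort -V | sed -n '1p')"
  [ "$lowest" = "$required" ]
}

# Example use on a machine with the NVIDIA driver installed:
#   installed="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | sed -n '1p')"
#   driver_at_least "$installed" "<minimum from the CUDA release notes>" \
#     && echo "driver is new enough" || echo "driver is too old"
```

Running the same query inside and outside the container should report the same driver version, since the container shares the host driver.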
Hi, I tested a new environment with NVIDIA driver 470. lidar_centerpoint builds without error, but it still shows the message when I try to launch it:
I used colcon build --continue-on-error
and 150 packages were built.
I am not sure how this happens, and I really want to test the performance.
Here is the code link I built (part of autoware.universe + part of autoware.common, 2.8G) https://1drv.ms/u/s!AnJ4ubRnmXsujIQxTIZucL7Ump9w7A?e=neq4JP
Thanks!
Can you try rebuilding the lidar_centerpoint package after removing its built targets in /build and /install, and also make sure that the CUDA version in Docker is the same as on the local system?
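The rebuild suggested above could be scripted roughly as follows. This is a sketch that assumes colcon's default isolated layout (build/<package> and install/<package> under the workspace root). clean_rebuild is a hypothetical helper name, not part of Autoware or colcon.

```shell
# Hypothetical helper: delete one package's stale artifacts, then rebuild it.
# Usage: clean_rebuild <workspace_dir> <package_name>
clean_rebuild() {
  local ws="$1" pkg="$2"
  # Refuse to run with empty arguments so rm never targets /build or /install.
  if [ -z "$ws" ] || [ -z "$pkg" ]; then
    echo "usage: clean_rebuild <workspace_dir> <package_name>" >&2
    return 1
  fi
  rm -rf "${ws}/build/${pkg}" "${ws}/install/${pkg}"
  if command -v colcon >/dev/null 2>&1; then
    # Rebuild only the selected package.
    (cd "$ws" && colcon build --packages-select "$pkg")
  else
    echo "colcon not found; removed stale artifacts only" >&2
  fi
}

# Example: clean_rebuild "$HOME/autoware_workspace" lidar_centerpoint
```

Other packages in the workspace are left untouched, so the rebuild stays short.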
@mikechan0731 do you have any updates?
@mikechan0731 tried rebuilding the stack and still had the issue. However, he decided to use the default Autoware Docker image to avoid it. We will close this issue until someone else faces a similar issue with their custom Docker image.
@mitsudome-r I have the same issue with a custom environment (Nvidia NGC TensorRT container with tensorrt 8.4.1-1+cuda11.6).
I fixed the issue by adding the following to CMakeLists.txt:
install(
  TARGETS
    centerpoint_cuda_lib
)
You can find the commit here.
It turns out libcenterpoint_cuda_libraries.so is built in the build/lidar_centerpoint folder but never installed to the install/lidar_centerpoint folder. I am curious as to why this would work with some versions of CUDA and TensorRT, as it seems to be a CMake issue. tensorrt_yolo, for example, has a similar line that installs its built CUDA library:
https://github.com/autowarefoundation/autoware.universe/blob/main/perception/tensorrt_yolo/CMakeLists.txt#L192
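To confirm the same symptom in another workspace, a quick check along these lines may help. It is a sketch only: locate_lib is a hypothetical helper name, and it assumes the workspace root contains the usual build and install trees.

```shell
# Hypothetical helper: report whether a library file exists under the
# build and install trees of a colcon workspace.
# Usage: locate_lib <workspace_dir> <library_file_name>
locate_lib() {
  local ws="$1" lib="$2" tree found
  for tree in build install; do
    # A missing tree is treated the same as "library not found".
    found="$(find "${ws}/${tree}" -name "$lib" 2>/dev/null || true)"
    if [ -n "$found" ]; then
      echo "${tree}: found"
    else
      echo "${tree}: missing"
    fi
  done
}

# Example: locate_lib . libcenterpoint_cuda_libraries.so
```

A result of "build: found" together with "install: missing" matches the situation described above, where the library is compiled but never installed.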
I suggest reopening this issue to see whether there are other causes. Otherwise, the example commit above would be the fix.
Closing this issue since this error is fixed in PR https://github.com/autowarefoundation/autoware.universe/pull/1916.