
Support for Radeon GPUs?

Open · viggy96 opened this issue 3 months ago · 6 comments

Description

Support Radeon GPUs for accelerating inference and training.

Will this change the current API? How? No

Who will benefit from this enhancement? All users of Radeon GPUs

viggy96 · Apr 23 '24 06:04

I've tried using the PyTorch ROCm build from https://repo.radeon.com/rocm/manylinux/rocm-rel-6.0/README.html, and it does work according to these validation instructions: https://rocm.docs.amd.com/projects/radeon/en/latest/docs/install/install-pytorch.html#verify-pytorch-installation

However, I get the following error when running my project:

[pool-1-thread-1] WARN ai.djl.util.Platform - The bundled library: cu121-linux-x86_64:2.1.1-20231129 doesn't match system: cpu-linux-x86_64:2.1.1
[pool-1-thread-1] INFO ai.djl.util.Platform - Ignore mismatching platform from: jar:file:/home/vignesh/.gradle/caches/modules-2/files-2.1/ai.djl.pytorch/pytorch-native-cu121/2.1.1/fe8e6fa55e25294ae61c9832c029d5dddbd759aa/pytorch-native-cu121-2.1.1-linux-x86_64.jar!/native/lib/pytorch.properties
[pool-1-thread-1] INFO ai.djl.util.Platform - Found matching platform from: jar:file:/home/vignesh/.gradle/caches/modules-2/files-2.1/ai.djl.pytorch/pytorch-native-cpu/2.1.1/2625b85275629071b06b0f7f27822e03257dffa0/pytorch-native-cpu-2.1.1-linux-x86_64.jar!/native/lib/pytorch.properties
OpenJDK 64-Bit Server VM warning: You have loaded library /home/vignesh/.local/lib/python3.11/site-packages/torch/lib/libamdhip64.so which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
amdgpu.ids: No such file or directory
terminate called after throwing an instance of 'std::runtime_error'
  what():  Invalid ext op lib path

viggy96 · Apr 23 '24 08:04

Has DJL ever used Radeon GPUs?

viggy96 · Apr 25 '24 05:04

@viggy96

We don't support ROCm. You can try to build the PyTorch JNI for ROCm yourself; see: https://github.com/deepjavalibrary/djl/blob/master/engines/pytorch/pytorch-native/build.sh#L26
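Once such a ROCm build of libtorch and the JNI exists locally, a minimal sketch of wiring it in might look like the following, assuming the PYTORCH_LIBRARY_PATH and PYTORCH_VERSION overrides described in DJL's documentation for using your own PyTorch native library; the directory below is a placeholder, not a real path:

import ai.djl.engine.Engine;

// Sketch: point DJL's PyTorch engine at a locally built ROCm libtorch/JNI.
// PYTORCH_LIBRARY_PATH and PYTORCH_VERSION can also be exported as environment
// variables instead of system properties, depending on how you launch the JVM.
public final class RocmPyTorchSetup {
    public static void main(String[] args) {
        // Directory containing the ROCm libtorch and the JNI built by build.sh (placeholder)
        System.setProperty("PYTORCH_LIBRARY_PATH", "/opt/libtorch-rocm/lib");
        // Match the libtorch version that was actually built
        System.setProperty("PYTORCH_VERSION", "2.1.1");

        // Loading the engine after setting the overrides picks up the custom native library
        Engine engine = Engine.getEngine("PyTorch");
        System.out.println("Loaded engine: " + engine.getEngineName() + " " + engine.getVersion());
    }
}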

frankfliu · Apr 25 '24 14:04

@viggy96

You can actually use DJL with ROCm through the OnnxRuntime engine; see: https://github.com/deepjavalibrary/djl/blob/master/engines/onnxruntime/onnxruntime-engine/src/main/java/ai/djl/onnxruntime/engine/OrtModel.java#L212-L213
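As a rough sketch, loading an ONNX model with the OnnxRuntime engine and requesting the ROCm execution provider could look like this, assuming the "ortDevice" option handled in the linked OrtModel code; the model path and translator are placeholders:

import ai.djl.inference.Predictor;
import ai.djl.ndarray.NDList;
import ai.djl.repository.zoo.Criteria;
import ai.djl.repository.zoo.ZooModel;
import ai.djl.translate.NoopTranslator;
import java.nio.file.Paths;

// Sketch: run an ONNX model on ROCm via the OnnxRuntime engine.
public final class RocmOrtExample {
    public static void main(String[] args) throws Exception {
        Criteria<NDList, NDList> criteria =
                Criteria.builder()
                        .setTypes(NDList.class, NDList.class)
                        .optModelPath(Paths.get("model.onnx"))   // placeholder model file
                        .optEngine("OnnxRuntime")
                        .optOption("ortDevice", "ROCM")          // request the ROCm execution provider
                        .optTranslator(new NoopTranslator())
                        .build();

        try (ZooModel<NDList, NDList> model = criteria.loadModel();
                Predictor<NDList, NDList> predictor = model.newPredictor()) {
            // call predictor.predict(...) with your own NDList inputs here
        }
    }
}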

frankfliu · Apr 26 '24 06:04

That sounds great. Do you have any object detection inference examples using OnnxRuntime?

viggy96 · Apr 26 '24 06:04

https://github.com/deepjavalibrary/djl/blob/master/examples/src/main/java/ai/djl/examples/inference/Yolov8Detection.java
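Loosely condensed from that linked example, a detection call with the OnnxRuntime engine looks roughly like the sketch below; the model file, image path, and translator arguments are placeholders to adapt, and the "ortDevice" option is the same assumption as above:

import ai.djl.inference.Predictor;
import ai.djl.modality.cv.Image;
import ai.djl.modality.cv.ImageFactory;
import ai.djl.modality.cv.output.DetectedObjects;
import ai.djl.modality.cv.translator.YoloV8TranslatorFactory;
import ai.djl.repository.zoo.Criteria;
import ai.djl.repository.zoo.ZooModel;
import java.nio.file.Paths;

// Sketch of YOLOv8-style object detection with an ONNX model on OnnxRuntime.
public final class OrtYoloDetection {
    public static void main(String[] args) throws Exception {
        Criteria<Image, DetectedObjects> criteria =
                Criteria.builder()
                        .setTypes(Image.class, DetectedObjects.class)
                        .optModelPath(Paths.get("yolov8n.onnx"))  // placeholder model
                        .optEngine("OnnxRuntime")
                        .optOption("ortDevice", "ROCM")           // as discussed above
                        .optArgument("width", 640)
                        .optArgument("height", 640)
                        .optArgument("resize", true)
                        .optArgument("toTensor", true)
                        .optArgument("threshold", 0.5)
                        .optTranslatorFactory(new YoloV8TranslatorFactory())
                        .build();

        Image img = ImageFactory.getInstance().fromFile(Paths.get("input.jpg")); // placeholder image
        try (ZooModel<Image, DetectedObjects> model = criteria.loadModel();
                Predictor<Image, DetectedObjects> predictor = model.newPredictor()) {
            DetectedObjects detections = predictor.predict(img);
            System.out.println(detections);
        }
    }
}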

frankfliu · Apr 26 '24 16:04