djl
Failed to download libraries
Description
Some files fail to download after updating pytorch-engine to 0.20.0. The files aren't on your cloud instances, so DJL just throws an error.
Expected Behavior
The native libraries download properly.
Error Message
ai.djl.engine.EngineException: Cannot download jni files: https://publish.djl.ai/pytorch/1.9.1/jnilib/0.20.0/linux-x86_64/cu111/libdjl_torch.so
at ai.djl.pytorch.jni.LibUtils.downloadJniLib(LibUtils.java:515)
at ai.djl.pytorch.jni.LibUtils.findJniLibrary(LibUtils.java:252)
at ai.djl.pytorch.jni.LibUtils.loadLibrary(LibUtils.java:80)
at ai.djl.pytorch.engine.PtEngine.newInstance(PtEngine.java:54)
at ai.djl.pytorch.engine.PtEngineProvider.getEngine(PtEngineProvider.java:40)
at ai.djl.engine.Engine.getEngine(Engine.java:186)
at ai.djl.engine.Engine.getInstance(Engine.java:141)
Caused by: java.io.FileNotFoundException: https://publish.djl.ai/pytorch/1.9.1/jnilib/0.20.0/linux-x86_64/cu111/libdjl_torch.so
at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1993)
at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1589)
at java.base/sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(HttpsURLConnectionImpl.java:224)
at java.base/java.net.URL.openStream(URL.java:1161)
at ai.djl.util.Utils.openUrl(Utils.java:459)
at ai.djl.util.Utils.openUrl(Utils.java:443)
at ai.djl.pytorch.jni.LibUtils.downloadJniLib(LibUtils.java:509)
... 12 more
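The URL in the trace appears to encode the PyTorch version, the DJL version, the platform, and the CUDA flavor, in that order. A minimal sketch, assuming that layout holds (it is inferred only from the URL in the error message, not from DJL's source), shows which combination is being requested. The class `JniUrlSketch` and the method `jniUrl` are hypothetical names used for illustration and are not part of DJL.

```java
public class JniUrlSketch {
    // Base URL copied from the error message above.
    private static final String BASE = "https://publish.djl.ai/pytorch";

    // Hypothetical helper: rebuilds the download URL from its apparent parts.
    // The path layout is an assumption inferred from the failing URL.
    public static String jniUrl(String pytorchVersion, String djlVersion,
                                String platform, String flavor) {
        return String.join("/", BASE, pytorchVersion, "jnilib",
                djlVersion, platform, flavor, "libdjl_torch.so");
    }

    public static void main(String[] args) {
        // PyTorch 1.9.1 combined with DJL 0.20.0 is the pair that is not found.
        System.out.println(jniUrl("1.9.1", "0.20.0", "linux-x86_64", "cu111"));
    }
}
```

Read this way, the request pairs native version 1.9.1 with engine version 0.20.0, which is the combination the dependency change below produces.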
How to Reproduce?
I've changed a simple application from
implementation("ai.djl.pytorch:pytorch-engine:0.16.0")
implementation("ai.djl.pytorch:pytorch-native-auto:1.9.1")
to
implementation("ai.djl.pytorch:pytorch-engine:0.20.0")
implementation("ai.djl.pytorch:pytorch-native-auto:1.9.1")
Steps to reproduce
Just launch a simple program containing this line, which triggers loading of the native libraries:
Engine.getInstance()
What have you tried to solve it?
I assume these files are missing from your servers, so nothing can really be done other than rolling back...
Environment Info
N/A
@waicool20
- ai.djl.pytorch:pytorch-native-auto is no longer needed; simply removing it will work.
- PyTorch 1.9.1 is not supported by 0.20.0. 0.20.0 supports 1.11.0, 1.12.1 and 1.13.0, see: https://docs.djl.ai/master/engines/pytorch/pytorch-engine/index.html
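The version constraint above can be sketched as a simple lookup. This is only an illustration of the list given in this reply: `PyTorchVersionCheck` and `isSupported` are hypothetical names, not DJL APIs, and the set contains just the three versions stated here for DJL 0.20.0.

```java
import java.util.Set;

public class PyTorchVersionCheck {
    // PyTorch versions listed in this thread as supported by DJL 0.20.0.
    private static final Set<String> SUPPORTED_BY_0_20_0 =
            Set.of("1.11.0", "1.12.1", "1.13.0");

    // Hypothetical helper: true if the given PyTorch version is in the list above.
    public static boolean isSupported(String pytorchVersion) {
        return SUPPORTED_BY_0_20_0.contains(pytorchVersion);
    }

    public static void main(String[] args) {
        System.out.println("1.9.1 supported: " + isSupported("1.9.1"));
        System.out.println("1.13.0 supported: " + isSupported("1.13.0"));
    }
}
```

For other DJL releases, the linked documentation page is the place to check the supported versions.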
That seems to work and GPU inference is fine, but when I force it to use the CPU by adding this to Gradle:
implementation("ai.djl.pytorch:pytorch-native-cpu:1.13.0:linux-x86_64")
it aborts with a very nondescript error:
Program aborted due to an unhandled Error:
Unable to find target for this triple (no targets are registered)
The error seems related to your jit traced model with PyTorch 1.13.0: https://discuss.pytorch.org/t/calling-forward-on-torchscript-model-multiple-times-leads-to-error/154990/3
Can you try PyTorch 1.12.1?
1.12.1 does not work, and neither does 1.11.0.
That link indicates it fails on multiple forwards, but this happens on the first forward/predict call.
Can you try it with Python:
python3 -m pip install torch==1.13.0+cpu -f https://download.pytorch.org/whl/torch_stable.html