Failed to find pytorch-native-cu128 from maven repo
Description
I am updating the DJL version to 0.34.0. The documentation says the pytorch-native GPU artifact has been updated to pytorch-native-cu128, but after updating the pom file I cannot find this jar in Maven Central: https://repo1.maven.org/maven2/ai/djl/pytorch/pytorch-native-cu128/2.7.1/pytorch-native-cu128-2.7.1.pom.
Browsing https://repo1.maven.org/maven2/ai/djl/pytorch/ shows that the latest native GPU jar is still pytorch-native-cu124.
Forgot to publish it?
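For reference, this is roughly the dependency I added (groupId/artifactId/version taken from the Maven Central path above; the linux-x86_64 classifier is my assumption, based on how the earlier cu124 natives were published):

```xml
<!-- PyTorch native CUDA 12.8 binaries for DJL 0.34.0; currently not resolvable from Maven Central -->
<dependency>
    <groupId>ai.djl.pytorch</groupId>
    <artifactId>pytorch-native-cu128</artifactId>
    <version>2.7.1</version>
    <classifier>linux-x86_64</classifier> <!-- assumed classifier -->
    <scope>runtime</scope>
</dependency>
```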
Expected Behavior
Release pytorch-native-cu128 with pytorch 2.7.1
We use OSS Sonatype to publish to Maven. Sonatype recently migrated to Maven Central, and we lost the ability to publish the CUDA jar files because of a file size limit (1 GB). We are waiting for Sonatype to whitelist the package and remove the limit.
@frankfliu I have the same issue. Does that mean DJL 0.34.0 currently cannot use GPU? Is there any temporary workaround?
@geekwenjie
You can use PyTorch with GPU if you have internet access. You don't need to include pytorch-native-cu128; by default, DJL will download PyTorch at runtime.
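For anyone unsure what that looks like in practice, here is a minimal sketch using the standard DJL NDArray API. It assumes only the ai.djl.pytorch:pytorch-engine dependency (no pytorch-native-* artifact) is on the classpath, so the native CUDA library is fetched into DJL's cache on first use:

```java
import ai.djl.Device;
import ai.djl.ndarray.NDArray;
import ai.djl.ndarray.NDManager;
import ai.djl.ndarray.types.Shape;

public class GpuCheck {
    public static void main(String[] args) {
        // Creating the manager loads the PyTorch engine; without a
        // pytorch-native-* dependency, DJL downloads the native library
        // at runtime on the first run.
        try (NDManager manager = NDManager.newBaseManager(Device.gpu())) {
            NDArray ones = manager.ones(new Shape(2, 2));
            System.out.println("Tensor allocated on: " + ones.getDevice());
        }
    }
}
```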
We are waiting for Sonatype to whitelist the package and remove the limit.
Any updates? @frankfliu
Is there any other place where I can get this jar file?
When will the pytorch-native-cu128 package with pytorch 2.7.1 be available for normal download? Could you provide a temporary download in the meantime? @frankfliu
We need this library; please upload it as soon as possible.
I tried it, and it does download automatically. I'll share the package with you: https://pan.baidu.com/s/1i09_a9AhVS941rXX2G5mpA?pwd=1234 (access code: 1234)
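If you do grab the jar manually like this, one way to let Maven resolve it without Maven Central is to install it into your local repository. A sketch (coordinates from the Maven Central path above; the file name and linux-x86_64 classifier are assumptions):

```sh
# Install the manually downloaded native jar into the local ~/.m2 repository
mvn install:install-file \
  -Dfile=pytorch-native-cu128-2.7.1-linux-x86_64.jar \
  -DgroupId=ai.djl.pytorch \
  -DartifactId=pytorch-native-cu128 \
  -Dversion=2.7.1 \
  -Dclassifier=linux-x86_64 \
  -Dpackaging=jar
```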
You can use PyTorch with GPU if you have internet access. You don't need to include pytorch-native-cu128; by default, DJL will download PyTorch at runtime.

Thanks! It works.
I tried it, and it does download automatically. I'll share the package with you: https://pan.baidu.com/s/1i09_a9AhVS941rXX2G5mpA?pwd=1234 (access code: 1234)
Thank you too~ I only just discovered the method frankfliu mentioned.
We use OSS Sonatype to publish to Maven. Sonatype recently migrated to Maven Central, and we lost the ability to publish the CUDA jar files because of a file size limit (1 GB). We are waiting for Sonatype to whitelist the package and remove the limit.
One option could be to distribute DJL with Vulkan.
For instance, the inference-focused KoboldCpp does so with its no-CUDA release: https://github.com/LostRuins/koboldcpp/releases/tag/v1.101.1
You can see that without CUDA, the binaries are much smaller.
Unlike CUDA, which is mainly geared towards Nvidia, Vulkan aims to support a broad range of hardware.
Llama.cpp shows that their optimized Vulkan implementation is only slightly slower than CUDA in terms of tokens/second (roughly 10% to 50% slower, at least as of 2024/2025). Funnily enough, Vulkan is now optimized so well in llama.cpp that in some circumstances it is faster than ROCm. Here is a comparison between llama.cpp Vulkan and llama.cpp CUDA.
llama.cpp is built on top of the tensor library ggml, which can be compiled with various backends (including Vulkan), so another option could be to distribute that one with DJL as well.
I tried to find java bindings for pure vulkan and the first thing that pops up is https://github.com/LWJGL/lwjgl3. It seems to be optimized for games, but who knows, maybe it's easy to adapt.
TL;DR
Distribute Vulkan for inference.
Main advantages:
- small size (gets around file size limit)
- support for broad spectrum of hardware (Nvidia, AMD, Intel, ...?)
Could you share the package for Linux? The previously shared package is for Windows. @geekwenjie