djl
An Engine-Agnostic Deep Learning Framework in Java
I have two model folders for Llama 3, one with the original weights and one with the fine-tuned weights. How do I configure djl-serving to serve both model folders?
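For context, one common way to arrange this (a hedged sketch, not a confirmed answer for this setup) is to point djl-serving's model store at a directory in which each subfolder is a separate model with its own serving.properties; the folder names and paths below are hypothetical.

```
model_store/
├── llama3-base/
│   ├── serving.properties      # e.g. option.model_id pointing at the original weights
│   └── (model files)
└── llama3-finetuned/
    ├── serving.properties      # e.g. option.model_id pointing at the fine-tuned weights
    └── (model files)
```

With the model store pointed at `model_store/` (the exact flag or environment variable depends on how the server is launched), each subfolder is typically loaded as its own model and served under its folder name; models can usually also be registered at runtime through the management API if the deployment exposes it.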
## Description
I am using version 0.29.0 with JDK 17 on Windows 11. The libtorch dependencies are loaded properly,...
### Environment Info
Container: Docker with no GPU
OS: AlmaLinux
CUDA installed: 12.2
cuDNN installed: 8.9.0
djl version: 0.29.0
onnxruntime_gpu version: 1.8.0
### Error Message
```
[root@r100048367-91051506-l5wvj powerop]# cat /tmp/hs_err_pid1062.log...
```
Model conversion process failed when deploying Mixtral 8x22B AWQ with djl-tensorrtllm to SageMaker
## Description
The model conversion process failed with djl-tensorrtllm and the serving.properties below:
```
image_uri = image_uris.retrieve(
    framework="djl-tensorrtllm",
    region=sess.boto_session.region_name,
    version="0.28.0"
)

%%writefile serving.properties
engine=MPI
option.model_id=MaziyarPanahi/Mixtral-8x22B-Instruct-v0.1-AWQ
option.tensor_parallel_degree=4
option.quantize=awq
option.max_num_tokens=8192
option.max_rolling_batch_size=8
```
### Expected...
## Description
Hello DJL Team, I am currently using the SAM2 model from the DJL ModelZoo for inference. However, I have encountered a couple of limitations that I would...
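For context, model-zoo models are normally consumed through DJL's Criteria API. The sketch below shows that pattern only; the zoo URL, the plain `Image` input, and the `CategoryMask` output are placeholders rather than the confirmed SAM2 contract (the real SAM2 pipeline also takes prompt points, which this sketch omits).

```java
import ai.djl.inference.Predictor;
import ai.djl.modality.cv.Image;
import ai.djl.modality.cv.ImageFactory;
import ai.djl.modality.cv.output.CategoryMask;
import ai.djl.repository.zoo.Criteria;
import ai.djl.repository.zoo.ZooModel;

import java.nio.file.Paths;

public class Sam2Sketch {

    public static void main(String[] args) throws Exception {
        // Hypothetical zoo URL; the actual SAM2 artifact group/name may differ.
        Criteria<Image, CategoryMask> criteria =
                Criteria.builder()
                        .setTypes(Image.class, CategoryMask.class)
                        .optModelUrls("djl://ai.djl.pytorch/sam2-hiera-tiny")
                        .optEngine("PyTorch")
                        .build();

        try (ZooModel<Image, CategoryMask> model = criteria.loadModel();
                Predictor<Image, CategoryMask> predictor = model.newPredictor()) {
            Image img = ImageFactory.getInstance().fromFile(Paths.get("input.jpg"));
            CategoryMask mask = predictor.predict(img);
            System.out.println(mask);
        }
    }
}
```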
## Description
@frankfliu asked me to create an issue to track this; it was originally reported by me on Slack: https://deepjavalibrary.slack.com/archives/C01AURG857U/p1727308663498229. Can we please add `intfloat/multilingual-e5-large-instruct` to the model zoo? It...
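To illustrate what is being asked for: once a Hugging Face embedding model is published to the zoo, it can typically be loaded like the sketch below. The `djl://` URL for this particular model is hypothetical until the zoo entry actually exists; the rest follows the usual text-embedding pattern.

```java
import ai.djl.huggingface.translator.TextEmbeddingTranslatorFactory;
import ai.djl.inference.Predictor;
import ai.djl.repository.zoo.Criteria;
import ai.djl.repository.zoo.ZooModel;

public class E5EmbeddingSketch {

    public static void main(String[] args) throws Exception {
        // Hypothetical URL: valid only once the model has been added to the zoo.
        Criteria<String, float[]> criteria =
                Criteria.builder()
                        .setTypes(String.class, float[].class)
                        .optModelUrls(
                                "djl://ai.djl.huggingface.pytorch/intfloat/multilingual-e5-large-instruct")
                        .optEngine("PyTorch")
                        .optTranslatorFactory(new TextEmbeddingTranslatorFactory())
                        .build();

        try (ZooModel<String, float[]> model = criteria.loadModel();
                Predictor<String, float[]> predictor = model.newPredictor()) {
            float[] embedding = predictor.predict("query: how do I serve two llama3 models?");
            System.out.println("embedding dimension = " + embedding.length);
        }
    }
}
```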
## Description
vLLM sampling parameters include a [richer set of values](https://github.com/vllm-project/vllm/blob/c9b45adeeb0e5b2f597d1687e0b8f24167602395/vllm/sampling_params.py), among which `logprobs` is particularly useful. When testing by adding the `logprobs` option to the request payload, the...
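For reference, the kind of payload being described looks roughly like the following (a sketch of the common `inputs`/`parameters` request shape; whether the handler actually forwards `logprobs` through to vLLM's sampling parameters is exactly what this issue is about):

```json
{
  "inputs": "What is Deep Java Library?",
  "parameters": {
    "max_new_tokens": 128,
    "temperature": 0.7,
    "logprobs": 5
  }
}
```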
## Description
When deploying Mistral 7B Instruct v0.2 on a SageMaker endpoint (ml.g5.12xlarge) using the TensorRT-LLM backend (just-in-time compilation), I noticed that some of the serving parameters get overwritten....
## Description
When running the examples on 0.30.0-SNAPSHOT I receive an UnsatisfiedLinkError.
### Expected Behavior
I expect the examples to run.
### Error Message
```
Failed to load PyTorch native library...
```