Siddharth Venkatesan
You can set it as an environment variable. I will need to update our docs to reflect this configuration.
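For context, LMI containers accept any serving.properties `option.<name>` setting as an `OPTION_<NAME>` environment variable on the container. A minimal sketch of passing one at deployment time via the SageMaker Python SDK; the image URI, role, model id, and the specific `OPTION_*` variable are placeholders, since the thread doesn't name the exact setting:

```python
import sagemaker
from sagemaker.model import Model

# Placeholder IAM role and LMI container image; substitute your own.
role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"
image_uri = "763104351884.dkr.ecr.us-east-1.amazonaws.com/djl-inference:0.30.0-lmi"

model = Model(
    image_uri=image_uri,
    role=role,
    # Any serving.properties `option.<name>` can instead be supplied as an
    # `OPTION_<NAME>` environment variable on the container.
    env={
        "HF_MODEL_ID": "TheBloke/Llama-2-7B-AWQ",  # placeholder model id
        "OPTION_ROLLING_BATCH": "vllm",
    },
)
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",
)
```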
I am able to reproduce this issue with DJL 0.29.0 (vllm 0.5.3.post1) and DJL 0.30.0 (vllm 0.6.2). I am also able to reproduce this issue with vllm directly, as you...
It does seem like vLLM supports converting a regular AWQ model to marlin format within vLLM, but doesn't support supplying a model already converted to marlin format directly to vLLM. See https://github.com/vllm-project/vllm/issues/7517. Unfortunately...
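A minimal sketch of the supported path under that limitation: point vLLM at the original AWQ checkpoint and let it repack the weights into the marlin kernel format at load time (the model id here is a placeholder):

```python
from vllm import LLM, SamplingParams

# Supported path: supply the original AWQ checkpoint; on supported GPUs
# vLLM converts the AWQ weights to the marlin kernel format at load time.
# Supplying a checkpoint already saved in marlin format is what the
# linked issue reports as unsupported.
llm = LLM(
    model="TheBloke/Llama-2-7B-AWQ",  # placeholder AWQ model id
    quantization="awq_marlin",        # or omit and let vLLM auto-detect
)

outputs = llm.generate(
    ["What is quantization?"],
    SamplingParams(max_tokens=64, temperature=0.7),
)
print(outputs[0].outputs[0].text)
```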
What is the payload you are using to invoke the endpoint? We do expose generation parameters that can be included in the inference request. Details are in https://docs.djl.ai/master/docs/serving/serving/docs/lmi/user_guides/lmi_input_output_schema.html. We have...
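For reference, a sketch of a request following that schema, invoked here via boto3 against a SageMaker endpoint (the endpoint name is a placeholder): the prompt goes under `inputs`, and generation parameters go under `parameters`.

```python
import json
import boto3

runtime = boto3.client("sagemaker-runtime")

# Per the LMI input/output schema: prompt in `inputs`, generation
# parameters (e.g. max_new_tokens, temperature) in `parameters`.
payload = {
    "inputs": "What is Deep Learning?",
    "parameters": {
        "max_new_tokens": 256,
        "temperature": 0.7,
        "do_sample": True,
    },
}

response = runtime.invoke_endpoint(
    EndpointName="my-lmi-endpoint",  # placeholder endpoint name
    ContentType="application/json",
    Body=json.dumps(payload),
)
print(json.loads(response["Body"].read()))
```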