docker 0.29.0-pytorch-inf2 with meta-llama/Meta-Llama-3.1-8B-Instruct fails
Description
Unable to use the OpenAI-compatible endpoint; requests produce the error below.
Error Message
PyProcess W-100-model-stdout: The following parameters are not supported by neuron with rolling batch: {'frequency_penalty'}.
How to Reproduce?
Using Docker with the following configuration:

```
"image": "deepjavalibrary/djl-serving:0.29.0-pytorch-inf2",
"envVars": "AWS_NEURON_VISIBLE_DEVICES=ALL
OPTION_TENSOR_PARALLEL_DEGREE=max
HF_HOME=/tmp/.cache/huggingface
OPTION_MODEL_ID=meta-llama/Meta-Llama-3.1-8B-Instruct
OPTION_ENTRYPOINT=djl_python.transformers_neuronx
OPTION_TRUST_REMOTE_CODE=true
SERVING_LOAD_MODELS=test::Python=/opt/ml/model
OPTION_ROLLING_BATCH=auto
OPTION_ENABLE_CHUNKED_PREFILL=true
OPTION_MAX_ROLLING_BATCH_SIZE=32
OPTION_N_POSITIONS=8192
OPTION_MAX_BATCH_DELAY=500
DJL_CACHE_DIR=/tmp/.cache/ ",
```
PyProcess W-100-model-stdout: The following parameters are not supported by neuron with rolling batch: {'frequency_penalty'}. This is just a warning. We would not fail because of this.
Do you have any other error messages in the log?
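Since the log line is only a warning about `frequency_penalty` being ignored by the neuron rolling-batch backend, one workaround is to drop that parameter from the request body before sending it. Below is a minimal client-side sketch; `UNSUPPORTED_PARAMS` and `sanitize_payload` are hypothetical names for illustration, not part of djl-serving's API.

```python
# Sketch: strip sampling parameters that the neuron rolling-batch
# backend warns about before sending an OpenAI-style request.
# These names are assumptions for this example, not DJL identifiers.
UNSUPPORTED_PARAMS = {"frequency_penalty"}

def sanitize_payload(payload: dict) -> dict:
    """Return a copy of the request body without parameters the
    backend would warn about; all other fields pass through."""
    return {k: v for k, v in payload.items() if k not in UNSUPPORTED_PARAMS}

request = {
    "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
    "messages": [{"role": "user", "content": "Hello"}],
    "frequency_penalty": 0.5,
    "max_tokens": 64,
}
clean = sanitize_payload(request)
# clean no longer contains "frequency_penalty"
```

The original request dict is left untouched; only the copy sent to the endpoint is filtered.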
This issue is stale because it has been open for 30 days with no activity.
This issue was closed because it has been inactive for 14 days since being marked as stale.