djl-serving

A universal scalable machine learning model deployment solution

Results 54 djl-serving issues

## Description Adds passthrough support for generation config for on-device embedding.

## Description Getting junk values and fewer generated tokens for the starcoderbase model with rolling batch type vllm. The accuracy of the generated text is also low. ## Expected...

bug

This PR contains a renovated Actions setup for DJL Serving as promised in https://github.com/deepjavalibrary/djl-serving/pull/1264. The major change is to create a new nightly orchestration action that will call the build,...

## Description Can't install djlbench on the aarch64/arm64 Ubuntu platform using the snap installer. However, it can be installed by explicitly downloading...

bug

## Description Do you intend to add [Attention Sinks](https://github.com/huggingface/transformers/commit/633215ba58fe5114d8c8d32e415a04600e010701) streaming as an alternative to the current streaming implementations for the huggingface, vllm and scheduler rolling batch modes?

enhancement

I am trying to understand whether I am using vLLM for my deployment here with the following setting: `option.rolling_batch = auto` I can't seem to find whether it...

bug

# Requirement Description Support in DJL Serving for microservice registries, such as Nacos. You can specify the address of the registry at runtime, and DJL Serving automatically registers with the microservice registry....

enhancement

## Description When streaming is enabled with Llama2 or Mistral models (models using LlamaTokenizer), the output is missing whitespace between words. For example, it produces text like `DaenerysistheKhaleesi` ### Expected Behavior Streaming output...

bug

This is a refactor to simplify the handling of tensor parallel degree. Previously, it was read independently in 3+ locations in the code, and the behavior determining the tpDegree is hard...

## Description Token streaming not working with rolling batch ### Expected Behavior (what's the expected behavior?) ### Error Message ## How to Reproduce? (If you developed your own code,...

bug