Yaron Rosenbaum
Hi, I'm exploring running your Docker setup in a cross-datacenter cluster. Would it be possible to expose the following parameters as `-e` environment variables for the Docker container? `cluster_name: ''`...
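A minimal sketch of what such support might look like, assuming a hypothetical entrypoint that renders `storm.yaml` from container environment variables (the variable name `CLUSTER_NAME` and the output format are illustrative assumptions, not the image's actual interface):

```python
# Hypothetical entrypoint helper: render cluster settings passed via
# `docker run -e CLUSTER_NAME=...` into a storm.yaml fragment at startup.
import os


def render_storm_yaml(env: dict) -> str:
    """Build a storm.yaml fragment from environment variables.

    Only cluster_name is shown; other keys would follow the same pattern.
    """
    cluster_name = env.get("CLUSTER_NAME", "")
    return f'cluster_name: "{cluster_name}"\n'


if __name__ == "__main__":
    # In a real entrypoint this would read os.environ and write the file.
    print(render_storm_yaml(dict(os.environ)), end="")
```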
Logviewer?
Hi, I would like to access the Storm logviewer on one of the supervisors. How do I do that? (AFAIK the supervisors are started by Mesos.) Thanks
InnoDB error
Hi buddy, first of all, thanks for taking the effort to put this repository in place. I'm looking for a quick way to test out VoIP for my home. I ran...
### What happened? Hi, I followed the instructions here: https://docs.litellm.ai/docs/providers/vllm My relevant config is:

```yaml
- model_name: Mistral-7B-Instruct-v0.2
  litellm_params:
    model: vllm/mistralai/Mistral-7B-Instruct-v0.2
    api_base: http://Mistral-7B-Instruct-v0.2.mycloud.local:8000
    api_key: fake-key
```

Queries fail. "No module named...
## Description Running djl-inference:0.27.0-neuronx-sdk2.18.1 with the Hugging Face model google/gemma-7b-it fails. ### Error Message

```
WARN PyProcess W-93-model-stderr: --- Logging error ---
WARN PyProcess W-93-model-stderr: Traceback (most recent call last):
WARN PyProcess W-93-model-stderr:...
```
### 🚀 The feature, motivation and pitch It seems like the current Docker images don't support Neuron (Inferentia). It would be very helpful if there were a tested, managed Neuron...
### Your current environment

```
root@9c92d584ab5f:/app# python3 ./collect_env.py
Collecting environment information...
WARNING 05-15 15:13:52 ray_utils.py:46] Failed to import Ray with ModuleNotFoundError("No module named 'ray'"). For multi-node inference, please install Ray with...
```
Running the benchmark script against llama-3-8b-inst on Inferentia 2 (djl-serving) results in:

```
python3.10 token_benchmark_ray.py \
  --model "openai/llama3-8b-inst" \
  --mean-input-tokens 550 \
  --stddev-input-tokens 150 \
  --mean-output-tokens 150 \
  --stddev-output-tokens...
```
### Your current environment Docker image: vllm/vllm-openai:v0.4.3 as well as 0.5.0.post1. Params:

```
--model=microsoft/Phi-3-medium-4k-instruct
--tensor-parallel-size=2
--disable-log-requests
--trust-remote-code
--max-model-len=2048
--gpu-memory-utilization=0.9
```

The container freezes (does nothing) after presenting the following...
## Description Unable to use the OpenAI endpoint; getting the error below. ### Error Message PyProcess W-100-model-stdout: The following parameters are not supported by neuron with rolling batch: {'frequency_penalty'}. ## How...
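One possible client-side workaround, sketched under the assumption that the unsupported parameters can simply be dropped before the request is sent (the set of parameter names below is illustrative; only `frequency_penalty` appears in the error above):

```python
# Hypothetical workaround: strip sampling parameters that the Neuron
# rolling-batch backend rejects from an OpenAI-style request payload
# before sending it. Parameter names beyond frequency_penalty are assumed.
UNSUPPORTED_NEURON_PARAMS = {"frequency_penalty"}


def filter_payload(payload: dict) -> dict:
    """Return a copy of the request payload without unsupported params."""
    return {k: v for k, v in payload.items()
            if k not in UNSUPPORTED_NEURON_PARAMS}


request = {
    "model": "my-model",
    "prompt": "Hello",
    "frequency_penalty": 0.5,
}
print(filter_payload(request))
```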