tsvisab

Results: 11 comments of tsvisab

This is a problem when using managed services that run the model in a container for you, such as Vertex AI or SageMaker, since the container is started with...

If you're running the model in a container, allocate enough shared memory via the `--shm-size` arg; then, within your container: `ray.init(_temp_dir="/dev/shm/tmp_or_whatever", num_gpus=NUM_GPUS, ...)`
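A minimal sketch of that setup, assuming the container was launched with something like `docker run --shm-size=16g ...` (the helper name and defaults here are hypothetical):

```python
def init_ray_in_shm(temp_dir: str = "/dev/shm/ray_tmp", num_gpus: int = 1):
    """Point Ray's temp directory at /dev/shm so its files land in
    shared memory. Assumes the container was started with a large
    enough --shm-size (e.g. `docker run --shm-size=16g ...`).
    """
    import ray  # lazy import; requires `pip install ray`

    ray.init(_temp_dir=temp_dir, num_gpus=num_gpus)
```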

I experience this too, also when running Llama 3 8B with `SamplingParams(n=1, ...)` and calling the model in parallel. In general, I think it relates to this [issue](https://github.com/vllm-project/vllm/issues/4556), so it's something with...

Update: the exact same error happens when initializing `meta-llama/Meta-Llama-3-8B-Instruct` with only a single GPU. The model is served with FastAPI + uvicorn; sequential calls work fine, call in...

OK, so it figures I wasn't using vLLM properly: loading the model with `LLM` doesn't support async requests (maybe the lib should be clearer about that); you should load with `AsyncLLMEngine` instead.
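A hedged sketch of that pattern: build one `AsyncLLMEngine` at startup and iterate its async generator inside a FastAPI handler (the route name and sampling settings are illustrative, not from the original thread):

```python
from uuid import uuid4


def build_app(model: str = "meta-llama/Meta-Llama-3-8B-Instruct"):
    """Serve vLLM behind FastAPI with AsyncLLMEngine, which handles
    concurrent requests; the blocking `LLM` class does not."""
    # Lazy imports: requires `pip install vllm fastapi`
    from fastapi import FastAPI
    from vllm import SamplingParams
    from vllm.engine.arg_utils import AsyncEngineArgs
    from vllm.engine.async_llm_engine import AsyncLLMEngine

    # Create the engine once at startup, not per request
    engine = AsyncLLMEngine.from_engine_args(AsyncEngineArgs(model=model))
    app = FastAPI()

    @app.post("/generate")
    async def generate(prompt: str):
        params = SamplingParams(n=1, max_tokens=256)
        final = None
        # generate() yields partial RequestOutputs; keep the last one
        async for out in engine.generate(prompt, params, request_id=str(uuid4())):
            final = out
        return {"text": final.outputs[0].text}

    return app
```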

Hey @sigridjineth, regarding your "stuck init" issue: how are you starting your container? Are you by any chance running the container on SageMaker or Vertex AI? In any case,...

> I also encountered the same error. This happens because (#2816) the `peft` library saves the base embedding layers as well when `save()` is called - https://github.com/huggingface/peft/blob/8dd45b75d7eabe7ee94ecb6a19d552f2aa5e98c6/src/peft/utils/save_and_load.py#L175. This is not...
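For illustration, a small filter over an adapter state dict that drops the base embedding weights peft saved alongside the LoRA weights (the key patterns here are assumptions; inspect your own checkpoint's keys to confirm):

```python
def strip_base_embeddings(adapter_state: dict) -> dict:
    """Drop base-model embedding/LM-head tensors that peft's save path
    may have written next to the LoRA weights. Key names are
    illustrative, not guaranteed to match every model."""
    return {
        k: v for k, v in adapter_state.items()
        if "embed_tokens" not in k and "lm_head" not in k
    }
```

E.g. `strip_base_embeddings({"base_model.model.embed_tokens.weight": w0, "lora_A.weight": w1})` keeps only the LoRA key.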

The symbol `_ZN2at4_ops15sum_IntList_out4callERKNS_6TensorEN3c1016OptionalArrayRefIlEEbSt8optionalINS5_10ScalarTypeEERS2_` is a mangled C++ function name; to demangle it, use this [Demangler tool](http://demangler.com/). The function is `at::_ops::sum_IntList_out::call(at::Tensor const&, c10::OptionalArrayRef<long>, bool, std::optional<c10::ScalarType>, at::Tensor&)`. So what happens is that...
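The same demangling can be done locally with binutils' `c++filt`, sketched here via a subprocess call (assumes `c++filt` is installed and on PATH):

```python
import subprocess

MANGLED = (
    "_ZN2at4_ops15sum_IntList_out4callERKNS_6TensorE"
    "N3c1016OptionalArrayRefIlEEbSt8optionalINS5_10ScalarTypeEERS2_"
)


def demangle(symbol: str) -> str:
    """Demangle a C++ symbol with `c++filt` (binutils)."""
    result = subprocess.run(
        ["c++filt", symbol], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()
```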

Waiting for this; any chance it's going to be merged? :)

Thanks! This definitely does something: when I use `"model": "something that does not exist"` it acts as the base model, but when I use the adapter key (i.e. `my_adapter`...