Muhammad Daniyal

9 comments by Muhammad Daniyal

Yeah, same error with OpenAIChat and `zero-shot-description`.

Hi, same question. What exactly is `model_draft_name`? If I would like to speed up [llama 7b](https://huggingface.co/meta-llama/Llama-2-7b), and I have stored the checkpoints locally, say in a folder named...

@andreas-solti thanks for the response. I watched the video and understood that it is basically a smaller model that should be relatively fast, so the actual model only has to...
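For anyone else landing here: the idea the comment above describes can be sketched in plain Python. This is a toy greedy variant with stand-in `draft_model` / `target_model` stubs (hypothetical names, not the vLLM API) — the cheap draft proposes several tokens, and the expensive target only has to verify them, accepting the matching prefix and correcting the first mismatch.

```python
# Toy sketch of (greedy) speculative decoding with stand-in "models".
# draft_model / target_model are hypothetical stubs, not real vLLM calls.

def draft_model(prefix, k):
    # Cheap proposer: suggests up to k next tokens (here a canned guess).
    canned = [1, 2, 3, 9, 9]
    return canned[:k]

def target_model(prefix):
    # Expensive model: returns its greedy next token for a given prefix.
    canned = [1, 2, 3, 4, 5]
    return canned[len(prefix)] if len(prefix) < len(canned) else None

def speculative_step(prefix, k=4):
    proposal = draft_model(prefix, k)
    accepted = []
    for tok in proposal:
        expected = target_model(prefix + accepted)
        if expected == tok:
            accepted.append(tok)       # draft guessed right: token is "free"
        else:
            accepted.append(expected)  # mismatch: keep the target's token, stop
            break
    return accepted

print(speculative_step([]))  # draft matches 1, 2, 3; target corrects the 4th
```

The target model is still consulted once per accepted token here; the real speedup comes from verifying the whole proposed chunk in a single batched forward pass instead of one pass per token.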

@parthsarthi03 I'm facing the same issue. Any update on this? @chenyujiang11 @ExtReMLapin were you able to resolve this?

Thanks for the response, got it.

I got it to work using the `AsyncLLMEngine` class.

```
from vllm import AsyncLLMEngine, AsyncEngineArgs

engine_args = AsyncEngineArgs(
    model=model,
    quantization=quant,
    enforce_eager=enforce_eager,
    tensor_parallel_size=tensor_parallel_size,
    enable_relay_attention=enable_relay_attention,
    sys_prompt=sys_prompt,
    sys_schema=sys_schema,
    sys_prompt_file=sys_prompt_file,
    sys_schema_file=sys_schema_file,
)
self.engine = ...
```

Got it, thank you @rayleizhu for the response. I'll test it with many prompts. However, one concern arises: when I execute let's say `prompts = prompts * 32` and then...
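To be precise about what `prompts = prompts * 32` does: it is plain Python list repetition, so the engine receives 32 interleaved copies of each original prompt, not 32 distinct requests. A minimal illustration (the prompt strings are made up):

```python
# List repetition duplicates the same items; order is interleaved copies.
prompts = ["What is relay attention?", "Summarize the paper."]
batched = prompts * 3

print(len(batched))                             # 6
print(batched[0] == batched[2] == batched[4])   # True: identical prompts
```

This matters for benchmarking: identical prompts may not exercise the engine the same way a batch of distinct requests would.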

> > Got it, thank you @rayleizhu for the response. I'll test it with many prompts. However, one concern arises: when I execute let's say `prompts = prompts * 32`...

Hi @rayleizhu, I tried measuring speed with both `enable_relay_attention = True` and `enable_relay_attention = False`, and I can see the difference. But I want to understand: is this the correct way...