doscherda

2 comments by doscherda

On your client side, look in llms/vllm/utils.py:

    def get_response(response: requests.Response) -> List[str]:
        data = json.loads(response.content)
        return data["text"]

Add an extra print for debugging:

    def get_response(response: requests.Response) -> List[str]:
        data =...
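The comment is cut off before the print statement itself, so the exact line the author suggested is not known. As a rough sketch of the idea only: the function below is a hypothetical `get_response_debug` (the name, the printed message, and the `KeyError` check are assumptions, not code from llama_index). It prints the decoded payload before reading `data["text"]`, so a server reply that lacks that key is easy to spot. A `SimpleNamespace` stands in for `requests.Response` here, since only the `.content` attribute is used.

```python
import json
from types import SimpleNamespace
from typing import Any, List


def get_response_debug(response: Any) -> List[str]:
    """Hypothetical debug variant of get_response (not the library's code)."""
    data = json.loads(response.content)
    # Show the decoded payload so an unexpected shape is visible immediately.
    print("server response:", data)
    # Assumption: fail with a descriptive message when "text" is absent.
    if "text" not in data:
        raise KeyError(f'no "text" field in response; got keys: {sorted(data)}')
    return data["text"]


# Stand-in for requests.Response; only .content is read by the function.
fake_response = SimpleNamespace(content=b'{"text": ["hello"]}')
result = get_response_debug(fake_response)
```

If the printed payload contains an error message or a differently named field instead of `"text"`, that points to a mismatch between the client and the server's response format rather than a problem in the client's JSON parsing.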

I think this is related to https://github.com/run-llama/llama_index/issues/12955