Results: 2 comments by doscherda
On your client side, look in llms/vllm/utils.py:

    def get_response(response: requests.Response) -> List[str]:
        data = json.loads(response.content)
        return data["text"]

Add an extra print for debugging:

    def get_response(response: requests.Response) -> List[str]:
        data =...
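The comment above is cut off, so the exact print it suggested is not known. As a rough sketch only, a debug variant could look something like the following. The name get_response_debug and the FakeResponse stand-in are hypothetical and are not part of llama_index; the function accepts any object with a .content attribute so it can be tried without a running server.

```python
import json
import sys
from typing import Any, List


def get_response_debug(response: Any) -> List[str]:
    """Hypothetical debug variant of get_response (not the library's code)."""
    # Show the raw body before parsing, so an unexpected payload is visible.
    print(f"raw response content: {response.content!r}", file=sys.stderr)
    data = json.loads(response.content)
    # Show the top-level keys to check whether "text" is actually present.
    print(f"parsed keys: {list(data)}", file=sys.stderr)
    return data["text"]


class FakeResponse:
    """Hypothetical stand-in for requests.Response, exposing only .content."""

    def __init__(self, content: bytes) -> None:
        self.content = content
```

Calling get_response_debug(FakeResponse(b'{"text": ["hello"]}')) prints the raw body and the parsed keys to stderr before returning the "text" list, which should make it easier to see what the server actually sent if the lookup fails.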
I think this is related to https://github.com/run-llama/llama_index/issues/12955