Philipp Schmid
Can you please share the versions you have installed?
Which versions of the libraries are you using? And what sequence length?
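If it helps, here is a small snippet for collecting those versions (a sketch; the package names are my assumption of a typical easyllm setup, adjust to your environment):

```python
# Print the installed versions of the relevant packages.
from importlib.metadata import version, PackageNotFoundError

for pkg in ["easyllm", "huggingface_hub", "transformers"]:
    try:
        print(f"{pkg}=={version(pkg)}")
    except PackageNotFoundError:
        print(f"{pkg} is not installed")
```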
Hello @dylan-stark, currently there is no way to [customize the `InferenceClient`](https://github.com/philschmid/easyllm/blob/a651e9dc28168441276ab3f9d5b1c3c2765ae735/easyllm/clients/huggingface.py#L165). If you want, you can create a PR with a recommendation on how to add that.
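A minimal sketch of one possible approach for such a PR, assuming a module-level options dict that gets forwarded to the client constructor (`client_kwargs` and `_get_client` are hypothetical names, not existing easyllm API):

```python
from huggingface_hub import InferenceClient

# Hypothetical module-level options forwarded to the underlying client,
# e.g. {"timeout": 120} or custom headers.
client_kwargs = {}

def _get_client(url: str, token: str) -> InferenceClient:
    # Forward any user-supplied options to the InferenceClient.
    return InferenceClient(url, token=token, **client_kwargs)
```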
Can you please provide more information, like the error you get, code to reproduce it, etc.?
What error are you getting? Can you please share it?
Let me look at that, but you should be able to install it with `pip install easyllm[bedrock]`.
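Once installed, a quick way to check that the extra is set up (a sketch; the module path is my assumption based on the pattern of the other easyllm clients):

```python
# Verify the bedrock client can be imported after installing the extra.
from easyllm.clients import bedrock

print(bedrock.__name__)
```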
I pushed an updated version.
What Llama model size are you using?
Did you make any other changes to the code besides the model id? What GPU are you using?
This is most likely due to the Inference API caching requests.
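If you want to rule that out, the Inference API honors an `x-use-cache: false` header to bypass the cache. A sketch using `huggingface_hub` directly (the model id is just an example; whether easyllm exposes this header is a separate question):

```python
from huggingface_hub import InferenceClient

# Disable request caching on the serverless Inference API via the
# x-use-cache header, so repeated identical prompts are recomputed.
client = InferenceClient(
    "meta-llama/Llama-2-7b-chat-hf",  # example model id
    headers={"x-use-cache": "false"},
)
print(client.text_generation("Hello!", max_new_tokens=20))
```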