avianion

Results: 10 issues by avianion

The list is great, but not for use in production. If you want to use it live, it's a huge 2.4 MB file, which even gzipped slows down one's site...

### Description
When an error occurs while handling the stream, it breaks the entire stream and raises an error. Is there a better way to do this? Because it removes the...
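A minimal sketch of one way to keep a single bad chunk from tearing down the whole stream, assuming the stream is an iterator of chunks; the names here are illustrative and not this project's API:

```python
# Hypothetical wrapper: surface a mid-stream error to a callback instead of
# letting it propagate and kill the consumer. `stream` is any iterable of chunks.
def resilient_stream(stream, on_error=None):
    it = iter(stream)
    while True:
        try:
            chunk = next(it)
        except StopIteration:
            return
        except Exception as exc:
            if on_error is not None:
                on_error(exc)          # report the failure to the caller
            return                     # stop cleanly instead of raising mid-stream
        yield chunk
```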

I loved this module, but it no longer works because only V4 signing is accepted now. Is it possible for someone to update this to V4 signing? Thanks so much!
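For reference, this is roughly what Signature Version 4 signing looks like when done with botocore, shown only as an illustration of V4 signing rather than as this module's API; the credentials, bucket, and region are placeholders:

```python
# Illustration of AWS Signature Version 4 signing using botocore (an assumption
# about the target service; not this module's own API). Placeholders throughout.
from botocore.auth import SigV4Auth
from botocore.awsrequest import AWSRequest
from botocore.credentials import Credentials

creds = Credentials("AKIA_EXAMPLE", "example-secret-key")
request = AWSRequest(
    method="GET",
    url="https://example-bucket.s3.us-east-1.amazonaws.com/some/key",
    headers={},
)
SigV4Auth(creds, "s3", "us-east-1").add_auth(request)

# The signed headers (Authorization, X-Amz-Date, ...) can now be sent with any HTTP client.
print(request.headers["Authorization"])
```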

Is it possible to increase the number of tokens sent per chunk during the streaming process, and how would one do so? This could also apply to triton-inference-server.

Does this project plan to support Llama 3 70B or 8B?

The official Llama 3 70B Instruct repo has updated the eos token ("eos_token": ""), yet when using this library with that eos token, no output is produced because it...
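If the goal is simply to stop generation at Llama 3's end-of-turn marker, a rough sketch with Hugging Face transformers (not necessarily the library in the issue) passes both terminators to generate(); the repo id and the "<|eot_id|>" token string are assumptions here:

```python
# Sketch with Hugging Face transformers, not the library from the issue.
# Assumes the Llama 3 Instruct end-of-turn token is "<|eot_id|>" and that the
# checkpoint id below is correct; adjust both for your setup.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-70B-Instruct"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>"),  # assumed end-of-turn token
]

inputs = tokenizer("Write one sentence about streams.", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64, eos_token_id=terminators)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```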

Is it possible to increase the number of tokens sent per chunk during the streaming process, and how would one do so? This could also apply to triton-inference-server (a client-side workaround is sketched below).

question
triaged
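No server-side knob is assumed here; as one possible client-side workaround for the chunk-size question above, streamed tokens can be buffered and re-emitted in larger chunks. A minimal sketch, assuming `token_stream` yields one decoded token string per streaming message:

```python
# Hypothetical client-side re-chunking: buffer individual streamed tokens and
# emit them in groups of `chunk_size`. `token_stream` is assumed to be any
# iterable of decoded token strings (e.g., from a streaming client).
def rechunk(token_stream, chunk_size=8):
    buffer = []
    for token in token_stream:
        buffer.append(token)
        if len(buffer) >= chunk_size:
            yield "".join(buffer)
            buffer = []
    if buffer:                      # flush whatever is left at end of stream
        yield "".join(buffer)
```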

https://arxiv.org/abs/2404.15420 "In-context learning (ICL) approaches typically leverage prompting to condition decoder-only language model generation on reference information. Just-in-time processing of a context is inefficient due to the quadratic cost of...
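The quadratic term is easy to see with a back-of-the-envelope count: self-attention over n context tokens builds an n × n score matrix, so doubling the context roughly quadruples that work (illustrative numbers only):

```python
# Illustrative only: show how the attention score matrix grows with context length.
for n in (1_000, 2_000, 4_000, 8_000):
    print(f"context = {n:>5} tokens -> ~{n * n:,} attention score entries per layer/head")
```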

**Describe the bug** Tokens per second is currently calculated including the latency from the beginning of the API request and/or from hitting the start button. However, tokens per second should...

type: bug
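A small sketch of the measurement the bug report argues for: start the clock at the first generated token so that request/queue latency is excluded; `stream` and the function name are illustrative, not the app's actual code:

```python
# Hypothetical metric: decode-phase tokens per second, timed from the first
# generated token rather than from the API request or the start button.
import time

def decode_tokens_per_second(stream):
    first_token_time = None
    last_token_time = None
    n_tokens = 0
    for _token in stream:
        now = time.perf_counter()
        if first_token_time is None:
            first_token_time = now   # clock starts at the first token
        last_token_time = now
        n_tokens += 1
    if n_tokens < 2:
        return 0.0                   # not enough tokens to measure a rate
    return (n_tokens - 1) / (last_token_time - first_token_time)
```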

All other images like VLLM etc. are available, but not this one. What gives?

Available: 24.05-py3-min, 24.05-py3-sdk, 24.05-pyt-python-py3, 24.05-tf2-python-py3, 24.05-vllm-python-py3, 24.05-py3, 24.05-py3-igpu-min, 24.05-py3-igpu-sdk, 24.05-py3-igpu

But no 24.05-trtllm-python-py3??