Todsaporn Banjerdkit
I tried it just now and it's working just fine.
Sorry, I don't develop with AIR anymore, so you'll need more than luck to get support.
Hi. At first glance, I think it would be easier to update via a Docker Compose update policy; Docker Hub usually hooks into GitHub and triggers an auto build, so...
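For anyone landing here: one common way to get that kind of update policy is a Watchtower sidecar that polls the registry and restarts containers when a new image is pushed. This is a generic sketch, not specific to this project; the `app` service and image name are illustrative:

```yaml
services:
  app:
    image: youruser/yourapp:latest   # illustrative image name
    restart: unless-stopped
  watchtower:
    image: containrrr/watchtower
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    command: --interval 300          # check for new images every 5 minutes
```

With the Docker Hub → GitHub auto-build hook in place, pushing to the repo would then roll out the new image without manual intervention.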
Maybe consider using https://github.com/rustformers/llama-rs in that case? And also https://github.com/sobelio/llm-chain for LangChain-style functionality.
> It would be pretty cool to add this and perhaps not too difficult. I believe function calling requires a few things: > > * A model which supports the...
UPDATE: It seems like I can use `client.query` instead of `client.search`. Maybe this is a way to fix it?

```python
results = client.query(
    collection_name="knowledge-base",
    query_text=prompt,
    limit=3,
)
results
```
> This is likely due to an OOM on your GPU. Which models are you using?

Exactly as in the demo: `meta-llama/Meta-Llama-3-8B-Instruct`. Anyway, when I try to use GGUF:

```
"model": "second-state/Llama-3-8B-Instruct-GGUF",...
```
I can reproduce this by running the Sprite Sheet example on an M3 Max: it takes 66-75% CPU, and 34% CPU with `opt-level = 1`. Not looking good for just one sprite animation.
> Can you try with `opt-level=3` or just `--release`? `opt-level=1` is barely any optimization and isn't a good indicator of real performance.

`opt-level=1` 👉 34% CPU
`opt-level=3` 👉 36% CPU...
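For anyone else benchmarking this: the optimization level is set per Cargo profile, so the comparison above corresponds to something like the following standard `Cargo.toml` settings (values shown are the ones discussed, not project defaults):

```toml
# Dev builds default to opt-level = 0; raise it for faster debug runs.
[profile.dev]
opt-level = 1

# `cargo run --release` uses this profile (opt-level = 3 is its default).
[profile.release]
opt-level = 3
```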
Aww, this just bit me.