Olivier DEBAUCHE
@abetlen Can you approve the test run, please?
> @gaby looks like they were built and uploaded as [artifacts](https://github.com/abetlen/llama-cpp-python/actions/runs/8849638104/artifacts/1451226680) but not added to the release(?) > > I'll take a look later but this is the last [workflow...
@abetlen Test: https://github.com/Smartappli/llama-cpp-python/releases/tag/test2
I avoided the API quota limit problems by adding a timer in my yaml:

```yaml
- name: ⌛ rate 1
  shell: pwsh
  run: |
    # add random sleep since we run...
```
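For reference, a complete version of such a rate-limiting step might look like the sketch below (the sleep bounds are assumptions, not the exact values from the workflow):

```yaml
- name: ⌛ rate 1
  shell: pwsh
  run: |
    # Sleep a random number of seconds so parallel jobs hit the
    # GitHub API at staggered times instead of all at once.
    Start-Sleep -Seconds (Get-Random -Minimum 1 -Maximum 60)
```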
Yes, correct: https://levelup.gitconnected.com/deploy-fastapi-with-hypercorn-http-2-asgi-8cfc304e9e7a
@gaby Review, please.
> [#1342 (comment)](https://github.com/abetlen/llama-cpp-python/issues/1342#issuecomment-2054099460) > > I'll paste my comment here, and maybe we can open a new discussion, basically I'm concerned about the size of releases ballooning with the number...
Not enabling AVX penalizes llama-cpp-python performance on both CPU and CUDA builds.
Copy that, thanks @gaby. In summary: AVX and AVX2 on CPU are enough.
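A workflow build step pinning that CPU baseline might look like the sketch below (the `LLAMA_AVX`/`LLAMA_AVX2` flag names and the wheel command are assumptions based on llama.cpp's CMake options at the time, not the exact step from the release workflow):

```yaml
- name: Build wheel with AVX/AVX2 baseline
  shell: bash
  run: |
    # Assumed flag names: llama.cpp exposed LLAMA_AVX / LLAMA_AVX2
    # as CMake toggles for the CPU instruction-set baseline.
    CMAKE_ARGS="-DLLAMA_AVX=ON -DLLAMA_AVX2=ON" python -m pip wheel . -w dist
```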
@abetlen workflow update done