Reezlaw

Results 15 comments of Reezlaw

I think there is a lot of confusion between the built-in Gradio API that listens on 7860 and the API extension that listens on 5000; they are mentioned interchangeably at...

I (probably wrongly) assumed that this would be covered by the docker build, or anyway that perhaps Oobabooga would want this to work out of the box with the docker...

I checked, and the Dockerfile does indeed install GPTQ-for-LLaMa, but from Oobabooga's own repository and not from sterlind's.

@bgagandeep No, honestly I gave up for now. I still consider it a bug, since the monkey patch is among the available options of the Web UI but doesn't work....

Have you tried _the other_ API? I ended up disabling the API extension and using the same port as the web interface (7860 by default) with the /run/textgen endpoint.

If you don't enable the extension, there is a built-in one from Gradio (I think). You must not use --chat, and it requires --no-stream. The "api example" files use this...

If you're using the API extension, the default port is 5000 and the endpoint is /api/v1/generate.
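
To make the difference between the two APIs concrete, here is a rough Python sketch using the ports and endpoints mentioned in this thread (7860 with /run/textgen for the built-in Gradio API, 5000 with /api/v1/generate for the extension). The host, the helper names `build_generate_request` and `generate`, and the JSON field names `prompt` and `max_new_tokens` are my assumptions, not something confirmed here, so check them against your version before relying on this.

```python
import json
import urllib.request

HOST = "127.0.0.1"  # assumption: the server runs on the local machine

# Ports and paths as mentioned in the comments above.
GRADIO_URL = f"http://{HOST}:7860/run/textgen"         # built-in Gradio API
EXTENSION_URL = f"http://{HOST}:5000/api/v1/generate"  # API extension


def build_generate_request(prompt, max_new_tokens=200, url=EXTENSION_URL):
    """Build (but do not send) a POST request for the API extension."""
    # Assumption: the extension accepts a JSON body with these field names.
    payload = {"prompt": prompt, "max_new_tokens": max_new_tokens}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt):
    """Send the request and return the decoded JSON response unchanged."""
    request = build_generate_request(prompt)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `generate("Hello")` only works with the server running and the API extension enabled. The Gradio endpoint is listed only to show the different port and path; as far as I can tell its request body has a different shape, so I have not tried to sketch it.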

Isn't the docker-compose method perfectly reproducible?

Oh no, I'm just a random schmuck, I was genuinely asking. Maybe this is a great idea