Refactor the API

Open nsarrazin opened this issue 2 years ago • 1 comment

The API is currently a bit of a mess, with little to no structure.

I'll be working on making things a bit more structured and expandable.

Mar 22 '23 00:03 nsarrazin

I'd love to see some separation between the web app and the model runner, or even the option to not run the model with this repo at all and instead use just the SvelteKit app + MongoDB with an API of our choice (the app looks fantastic, by the way).

For example, this project: https://github.com/oobabooga/text-generation-webui lets you run a number of models (including Llama / Alpaca) with optimizations like 8-bit and even the new GPTQ 4-bit inference, so it's possible to run 30B models using around 18GB of VRAM. It also has an API that lets you do generation without going through their Gradio interface.

I think it'd also let you iterate faster, since you wouldn't have to spend so much effort getting the model to run on every platform and could focus on the web app instead.
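
A rough sketch of what that separation could look like, in case it helps the discussion. Everything here is hypothetical: the names `GenerationBackend`, `RemoteApiBackend`, `EchoBackend`, and `answer`, the URL, and the `{"prompt": ...}` / `{"text": ...}` payload shape are made up for illustration and are not taken from this repo or from text-generation-webui's actual API. The idea is only that the app depends on a small interface, so the model could run locally or behind any HTTP API.

```python
import json
from dataclasses import dataclass
from typing import Callable, Protocol


class GenerationBackend(Protocol):
    """Anything that can turn a prompt into generated text."""

    def generate(self, prompt: str) -> str: ...


@dataclass
class RemoteApiBackend:
    """Forwards prompts to an external HTTP API (hypothetical payload shape)."""

    url: str
    # Injected so the sketch runs without a network; a real app would pass
    # a function that performs an HTTP POST and returns the response body.
    post: Callable[[str, bytes], bytes]

    def generate(self, prompt: str) -> str:
        body = json.dumps({"prompt": prompt}).encode("utf-8")
        reply = json.loads(self.post(self.url, body))
        return reply["text"]


@dataclass
class EchoBackend:
    """Stand-in for a model hosted by this repo itself."""

    def generate(self, prompt: str) -> str:
        return f"echo: {prompt}"


def answer(backend: GenerationBackend, prompt: str) -> str:
    """The app-facing code only knows about the GenerationBackend interface."""
    return backend.generate(prompt).strip()


def fake_post(url: str, body: bytes) -> bytes:
    """Test double that mimics a remote server's JSON reply."""
    prompt = json.loads(body)["prompt"]
    return json.dumps({"text": f" remote reply to {prompt} "}).encode("utf-8")
```

Swapping backends would then be a configuration choice, e.g. `answer(EchoBackend(), "hi")` versus `answer(RemoteApiBackend("http://localhost:5000/generate", fake_post), "hi")`, with the rest of the app unchanged.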

Mar 23 '23 03:03 LoopControl