GenossGPT
support MLC-AI/mlc & RWKV ai00_server
Regarding running local LLMs, may I suggest supporting non-CUDA setups as first-class citizens?
There are two outstanding projects that ignore the "every GPU is an Nvidia" credo and are therefore usable on all other hardware, which is most likely the majority (but is largely ignored).
Please have a look at https://github.com/mlc-ai/mlc-llm and https://github.com/BlinkDL/RWKV-LM, along with the outstandingly fast and compact server https://github.com/cgisky1980/ai00_rwkv_server, which runs a quantized 13B Vicuna model on an old RX 580 with 8 GB VRAM (via Vulkan). Full OpenAI API support is still pending there.
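For reference, a minimal sketch of what talking to such a local, non-CUDA backend could look like once it exposes an OpenAI-compatible API. The base URL, port, and model name below are assumptions for illustration, not the actual ai00_rwkv_server interface:

```python
import json

# Assumed local endpoint of an OpenAI-compatible backend (hypothetical).
BASE_URL = "http://localhost:8000/v1"

def build_chat_request(prompt: str, model: str = "rwkv-raven-13b") -> dict:
    """Build a standard OpenAI-style /chat/completions payload.

    The model name is a placeholder; a real client would POST this dict
    as JSON to BASE_URL + "/chat/completions".
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }

payload = build_chat_request("Hello from a Vulkan GPU!")
print(json.dumps(payload, indent=2))
```

Because the request shape is the standard OpenAI one, any existing OpenAI client could be pointed at the local server simply by overriding its base URL, so Genoss would get non-CUDA backends almost for free.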