Use the LLaMA model to bypass API restriction issues
The LLaMA model has been leaked and many people are running it locally on their machines. Maybe we can collectively host it and create an endpoint to use when the ChatGPT endpoint gets restricted or saturated. Here are two projects to look at, from @cocktailpeanut and @ggerganov:
https://github.com/cocktailpeanut/dalai
https://github.com/ggerganov/llama.cpp
https://github.com/antimatter15/alpaca.cpp for chat
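A minimal sketch of the fallback idea in Python, assuming a community-hosted llama.cpp-style completion endpoint; `LLAMA_URL`, the `/completion` path, and the response shape are placeholder assumptions for illustration, not something either project guarantees:

```python
import requests

OPENAI_URL = "https://api.openai.com/v1/chat/completions"
# Hypothetical community-hosted LLaMA endpoint; the URL, path,
# and payload shape below are assumptions for illustration.
LLAMA_URL = "http://localhost:8080/completion"

def complete(prompt: str, api_key: str) -> str:
    """Try ChatGPT first; fall back to the self-hosted LLaMA
    endpoint when the API is restricted, saturated, or errors out."""
    try:
        resp = requests.post(
            OPENAI_URL,
            headers={"Authorization": f"Bearer {api_key}"},
            json={
                "model": "gpt-3.5-turbo",
                "messages": [{"role": "user", "content": prompt}],
            },
            timeout=30,
        )
        resp.raise_for_status()
        return resp.json()["choices"][0]["message"]["content"]
    except requests.RequestException:
        # ChatGPT is unavailable or rate-limited: use the hosted LLaMA.
        resp = requests.post(
            LLAMA_URL,
            json={"prompt": prompt, "n_predict": 256},
            timeout=120,
        )
        resp.raise_for_status()
        return resp.json()["content"]
```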
But it's very computationally expensive, even more so than gpt-3.5-turbo.
> But it's very computationally expensive, even more so than gpt-3.5-turbo.
I've seen some people running it on their Raspberry Pi 🤔 I haven't looked much further into it
> But it's very computationally expensive, even more so than gpt-3.5-turbo.
True. I tried running it, then got a BSoD because of a lack of resources.
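For a rough sense of the resources involved, here is a back-of-envelope RAM estimate; the parameter counts are the published LLaMA sizes, and the bytes-per-weight figures are approximations for float16 versus the 4-bit quantization used by llama.cpp and alpaca.cpp:

```python
# Approximate RAM needed just to hold the weights (excludes
# activations and KV cache, so real usage runs somewhat higher).
PARAMS = {"7B": 7e9, "13B": 13e9, "30B": 33e9, "65B": 65e9}

for name, n in PARAMS.items():
    fp16_gb = n * 2 / 1e9   # 2 bytes per weight in float16
    q4_gb = n * 0.5 / 1e9   # ~0.5 bytes per weight at 4 bits
    print(f"{name}: ~{fp16_gb:.0f} GB fp16, ~{q4_gb:.1f} GB 4-bit")
```

That is roughly why a 4-bit 7B model (~3.5 GB of weights) can squeeze onto a Raspberry Pi with enough RAM, while unquantized models exhaust a typical desktop.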
> I've seen some people running it on their Raspberry Pi 🤔 I haven't looked much further into it
Same thing.
How good is LLaMA compared with ChatGPT?
Besides, one concern is cost: a centralized LLM API service shared by many people will be cheaper per user than each individual hosting their own LLM on their own server.
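The economies-of-scale point can be made concrete with a toy amortization model; every number below is a hypothetical placeholder, not a measurement:

```python
# A GPU server costs roughly the same per hour whether it is idle
# or busy, so the effective cost per request depends on utilization.
SERVER_COST_PER_HOUR = 1.0  # hypothetical $/hour for one GPU server

def cost_per_request(requests_per_hour: float) -> float:
    """Server cost amortized over the requests it actually serves."""
    return SERVER_COST_PER_HOUR / requests_per_hour

individual = cost_per_request(10)      # one user, mostly idle box
centralized = cost_per_request(5_000)  # many users batched together
print(f"individual: ${individual:.4f}/req  centralized: ${centralized:.4f}/req")
```

A centralized service keeps its hardware busy by batching many users' requests, which is what drives the per-request cost down.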