
Run with Local LLM Models

Open IntelligenzaArtificiale opened this issue 1 year ago • 14 comments

We tried many local models like LLaMA, Vicuna, OpenAssistant, and GPT4All in their 7B versions. None seem to give results like the ChatGPT API.

We would like to test new models that can be loaded in a maximum of 16 GB of RAM, to keep the project accessible to anyone without discrimination.

Any advice on LLM models fine-tuned for high-performance instruction following?

IntelligenzaArtificiale avatar Apr 29 '23 22:04 IntelligenzaArtificiale

Does this project support third-party OpenAI-compatible interfaces (such as poe.com)? If it does, are there any other requirements for these interfaces, such as message format, context memory, or number of conversations?

wingeva1986 avatar Apr 30 '23 08:04 wingeva1986

@wingeva1986 Previously this repository was based on the API provided by xtekky/gpt4free. The problem was that (understandably) some of those APIs went down every day, and our repository was flooded with issues related not to the project but to the cracked xtekky APIs. At the moment, the solution based on free and legal calls to chat.openai.com is the most stable one.

You could try to reverse engineer sites or portals in a legal way. For example, HuggingChat is a free service open to all. It would be interesting to find the HuggingChat endpoint and integrate it into the project.
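For anyone who wants to experiment with this idea, here is a minimal sketch in Python. Note that HuggingChat has no documented public API: the endpoint URL and the payload shape below are pure assumptions for illustration, and would have to be discovered by inspecting the browser's network traffic.

```python
import json
import urllib.request

# ASSUMPTION: this URL and payload shape are placeholders -- HuggingChat
# has no documented public API, so the real endpoint must be found by
# inspecting the requests the web UI makes.
HUGGINGCHAT_URL = "https://huggingface.co/chat/conversation"  # assumed


def build_chat_request(prompt: str, temperature: float = 0.7) -> dict:
    """Build a JSON payload of the kind a chat endpoint typically expects."""
    return {
        "inputs": prompt,
        "parameters": {
            "temperature": temperature,
            "max_new_tokens": 256,
        },
    }


def send_chat_request(prompt: str) -> str:
    """POST the prompt to the (assumed) endpoint and return the raw body."""
    payload = json.dumps(build_chat_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        HUGGINGCHAT_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")
```

The payload builder is kept separate from the network call so the request format can be adjusted once the real endpoint is known.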

IntelligenzaArtificiale avatar Apr 30 '23 10:04 IntelligenzaArtificiale


https://huggingface.co/CRD716/ggml-vicuna-1.1-quantized/blob/main/ggml-vicuna-7b-1.1-q4_0.bin

HirCoir avatar May 03 '23 16:05 HirCoir


We can't expect LLaMA models to be as competitive as GPT; keep in mind that the quality of the responses depends on the number of parameters of the trained model. I've tried many models in my language, and they all generate poor responses, like the GPT4All models based on parrot and alpaca.

I have tested the quantized Vicuna 13B model, and let me tell you that despite weighing only 4 GB, it is capable of maintaining a fluent conversation while consuming fewer resources. I am running it on a 4-core ARM Ampere server with 32 GB of RAM; it uses more CPU than RAM and is able to respond correctly. I also managed to connect it to a WhatsApp chat using the Baileys library.

If you are interested in testing the model, I could give you access to my server so you can try it. It's not spam, but search for my name on YouTube and you will find a tutorial where I put Llama.cpp and Alpaca.cpp to the test on two servers with the same hardware.
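For reference, running a quantized Vicuna GGML file like the one linked above can be sketched with the llama-cpp-python bindings (`pip install llama-cpp-python`). The model path, thread count, and the Vicuna v1.1 prompt template here are assumptions; adjust them to your download and hardware.

```python
# Sketch of local inference on a quantized Vicuna GGML model via
# llama-cpp-python. ASSUMPTIONS: the model file has been downloaded to the
# working directory, and the v1.1 conversation template below matches the
# checkpoint's fine-tuning.

def vicuna_prompt(user_message: str) -> str:
    """Wrap a message in the Vicuna v1.1-style conversation template."""
    system = (
        "A chat between a curious user and an artificial intelligence "
        "assistant. The assistant gives helpful, detailed answers."
    )
    return f"{system} USER: {user_message} ASSISTANT:"


def generate(user_message: str,
             model_path: str = "ggml-vicuna-7b-1.1-q4_0.bin") -> str:
    """Load the model and complete one turn of conversation."""
    # Imported lazily so the prompt helper works without the library/model.
    from llama_cpp import Llama

    llm = Llama(model_path=model_path, n_ctx=2048, n_threads=4)
    out = llm(vicuna_prompt(user_message), max_tokens=256, stop=["USER:"])
    return out["choices"][0]["text"].strip()
```

Stopping on `"USER:"` prevents the model from hallucinating the next user turn, which small chat models often do.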

I wrote this answer using a translator; my native language is Spanish.

HirCoir avatar May 03 '23 17:05 HirCoir

Have you tested mosaicml/mpt-7b-chat or mosaicml/mpt-7b-instruct? They seem promising.

Therealkorris avatar May 07 '23 23:05 Therealkorris

@Therealkorris We haven't tried them yet, but we believe that mpt-7b-instruct and LaMini-GPT can give better results than other open-source models.

Have you already managed to implement a pipeline to generate text with mpt-7b-instruct? If so, what hardware do you have? Would you like to share your pipeline?
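Not a tested setup, but such a pipeline might be sketched like this with Hugging Face transformers. The Alpaca-style prompt template is an assumption based on how the model card describes its instruction tuning; loading the full weights needs well over 16 GB of RAM in float32, so bfloat16 is used here and the load is kept behind a function.

```python
# Sketch of a text-generation pipeline for mosaicml/mpt-7b-instruct.
# ASSUMPTION: the Alpaca-style instruction template below matches what the
# model was fine-tuned on; check the model card before relying on it.

INSTRUCT_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n"
    "### Instruction:\n{instruction}\n### Response:\n"
)


def format_instruction(instruction: str) -> str:
    """Wrap a raw instruction in the assumed instruction-tuning template."""
    return INSTRUCT_TEMPLATE.format(instruction=instruction)


def build_pipeline():
    """Load the model; requires substantial RAM, so done lazily."""
    import torch
    from transformers import pipeline

    # trust_remote_code is required because MPT ships custom model code.
    return pipeline(
        "text-generation",
        model="mosaicml/mpt-7b-instruct",
        torch_dtype=torch.bfloat16,
        trust_remote_code=True,
    )
```

Usage would be `build_pipeline()(format_instruction("Summarize this text: ..."), max_new_tokens=256)`.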

IntelligenzaArtificiale avatar May 08 '23 12:05 IntelligenzaArtificiale


@HirCoir Have you already implemented a pipeline to generate text with it? What hardware does it run on?

IntelligenzaArtificiale avatar May 08 '23 12:05 IntelligenzaArtificiale

What do you think about Cerebras?

https://huggingface.co/cerebras

sambickeita avatar May 08 '23 21:05 sambickeita

Any other LLM model support? I'm trying to use the new mega13b.

GoZippy avatar May 17 '23 00:05 GoZippy

@GoZippy @wingeva1986 @Therealkorris @HirCoir We all know the open-source models more or less. The problem is that a new one comes out every day, and most lack the performance of GPT-3.

If you want to help us, share here the code you use to run inference with the models you recommend, so that we can test them easily.

For example, @GoZippy, share the code you use to run inference on the mega13b model.

That way we can create a custom LLM wrapper with LangChain and run AutoGPT; if it gives good results, we will upload everything to the repository ❤

Thanks for the help.
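To make the wrapper idea concrete: LangChain's custom-LLM pattern boils down to implementing `_call()` and `_llm_type` on a subclass of its `LLM` base class. The sketch below uses a dependency-free stand-in for `langchain.llms.base.LLM` so it runs without LangChain installed; `generate_fn` is a placeholder for whatever local inference code (llama.cpp, transformers, ...) you plug in.

```python
# Dependency-free sketch of the wrapper pattern LangChain expects for a
# custom LLM. LLMBase below is a stand-in for langchain.llms.base.LLM so
# the example runs on its own; with LangChain installed you would subclass
# the real base class instead.

class LLMBase:  # stand-in for langchain.llms.base.LLM
    def __call__(self, prompt: str) -> str:
        return self._call(prompt)


class LocalModelLLM(LLMBase):
    """Wraps a local model so agent frameworks can call it like any LLM."""

    def __init__(self, generate_fn):
        # generate_fn: any callable mapping a prompt string to a completion,
        # e.g. a llama.cpp or transformers inference function.
        self.generate_fn = generate_fn

    @property
    def _llm_type(self) -> str:
        return "local-model"

    def _call(self, prompt: str, stop=None) -> str:
        text = self.generate_fn(prompt)
        # Honour stop sequences the agent framework may pass in.
        if stop:
            for token in stop:
                text = text.split(token)[0]
        return text
```

With the real base class, an instance like `LocalModelLLM(my_generate)` could then be handed to AutoGPT-style chains wherever an LLM object is expected.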

IntelligenzaArtificiale avatar May 17 '23 12:05 IntelligenzaArtificiale

https://github.com/oobabooga/text-generation-webui

prehcp avatar May 19 '23 00:05 prehcp

Currently, Starling is the best 7B model to date: https://huggingface.co/bartowski/Starling-LM-7B-beta-GGUF

Tempaccnt avatar Apr 03 '24 00:04 Tempaccnt

Any progress on this? I'll be home shortly and will look into this again, but I have been using other tools as of late. I lost track of where AutoGPT was going with all the Forge stuff... a year ago...

GoZippy avatar Apr 03 '24 02:04 GoZippy

Same here; I have been too busy, so I stopped keeping up. But recently I found an AI agent called evo.ninja: it has a workspace and a great interface, and it is currently ranked as the top AutoGPT-style agent. Unfortunately, it requires an OpenAI API.

So I looked into alternatives, and that is how I ended up here.

Tempaccnt avatar Apr 03 '24 02:04 Tempaccnt