aider
Support for other LLMs, local LLMs, etc
This issue is a catch all for questions about using aider with other or local LLMs. The text below is taken from the FAQ.
Aider provides experimental support for LLMs other than OpenAI's GPT-3.5 and GPT-4. The support is currently only experimental for two reasons:
- GPT-3.5 is just barely capable of editing code to provide aider's interactive "pair programming" style workflow. None of the other models seem to be as capable as GPT-3.5 yet.
- Just "hooking up" aider to a new model by connecting to its API is almost certainly not enough to get it working in a useful way. Getting aider working well with GPT-3.5 and GPT-4 was a significant undertaking, involving specific code editing prompts and backends for each model and extensive benchmarking. Officially supporting each new LLM will probably require a similar effort to tailor the prompts and editing backends.
Numerous users have experimented with numerous models. None of these experiments have yet identified another model that looks capable of working well with aider.
Once we see signs that a particular model is capable of code editing, it would be reasonable for aider to attempt to officially support such a model. Until then, aider will simply maintain experimental support for using alternative models.
More information
For more information on connecting to other models, local models and Azure models please see the FAQ.
There are ongoing discussions about LLM integrations in the aider discord.
Here are some GitHub issues which may contain relevant information.
Why not have a unified API, and then a plugin system to integrate other LLMs? Then you could just provide the GPT-3.5 and GPT-4 plugins officially yourself.
Per the FAQ and info above in the issue, there are already some useful hooks for connecting to other LLMs. A number of users have been using them to experiment with local LLMs. Most reports haven't been very enthusiastic about how they perform with aider compared to GPT-3.5/4.
Aider does have a modular system for building different "coder" backends, allowing customization of prompts and edit formats.
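As a rough illustration of that kind of modularity, a "coder" backend boils down to a prompt template paired with a parser for the model's edit replies. This is a minimal sketch under that assumption; the class and method names here are hypothetical, not aider's actual API:

```python
# A minimal sketch of a pluggable "coder" backend, assuming each backend
# pairs a system prompt with a parser for the model's reply format.
# All names here are illustrative, not aider's real classes.

class WholeFileCoder:
    """Trivial backend: asks the model for a whole replacement file."""

    system_prompt = "Return the complete updated file, nothing else."

    def parse_reply(self, reply: str) -> dict:
        # For a whole-file format, the reply *is* the new file content.
        return {"mode": "whole-file", "content": reply}


class SearchReplaceCoder(WholeFileCoder):
    """Backend expecting before/after blocks separated by '======='."""

    system_prompt = "Return the original lines, then '=======', then the new lines."

    def parse_reply(self, reply: str) -> dict:
        before, _, after = reply.partition("=======")
        return {"mode": "diff", "search": before.strip(), "replace": after.strip()}


coder = SearchReplaceCoder()
edit = coder.parse_reply("old_line()\n=======\nnew_line()")
# edit["search"] holds the original text, edit["replace"] the new text
```

Supporting a new model then becomes a matter of picking (and tuning) whichever prompt/parser pair that model can follow reliably.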
So all the raw materials seem to be available to get aider working with specific new models. I'm always happy to advise as best I can. And if you see evidence that a particular model has the potential to work well with aider, that would be exciting news.
Great work! Has anyone confirmed successful usage with other local LLMs? It's not about cheapness; we still don't have access to the ChatGPT API, nor are we able to pay for any alternatives.
Hello! Great news: there is another contender in the ecosystem of open source LLM coders.
Yesterday I tested the quantized version at https://huggingface.co/TheBloke/NewHope-GGML (TheBloke also released a GPTQ version), running locally with ooba's text-generation-webui and its OpenAI extension activated. From what I've read, it's a Llama 2 model fine-tuned similarly to WizardCoder.
It seems to be better than WizardCoder, but still needs prompt adjustments to run and be usable. Oops: the group that originally released the model had to withdraw it after realizing that some of the evaluation data had slipped into the training data, so the comparative results they reported weren't real. Even so, TheBloke's quantized model is still available and can be downloaded and tested.
The NewHope model was retracted because it was contaminated with test data causing overfit.
https://twitter.com/mathemagic1an/status/1686814347287486464?s=46&t=hIokEbug9Pr72tQFuXVULA
So far none of the models come close to gpt and cannot follow instructions well enough to work with aider.
@aldoyh
#170 might be of interest - are you able to access the openrouter apis?
Hey @paul-gauthier, while this question isn’t directly related to using other LLMs, I was wondering your advice for where to poke around to embed additional context into the prompt.
My friend and I are putting together a context embedding for some relevant up to date developer documentation and would love to try aider in conjunction with that context.
Thanks for this tool!
I had to research that before I could reply; here's my answer.
No, I didn't know about it and I'm going to check it out. But even if I deployed a LocalAI instance, isn't there a way to make aider look at it?
@aldoyh Have a look at my comments in https://github.com/paul-gauthier/aider/issues/138
Appreciate your work on this project. This represents one of the biggest missing pieces between LLM and real code writing utility, so well done.
I read the bits about how hard it is to add new models, I just want to request you take a look at Claude 2. v2 added some big improvements on the code side, and while it is still dumber than GPT-4, it has a more recent knowledge cutoff. Plus massive context.
I understand you are working around context limitations with ctags, but it could be interesting to see if there is an advantage to being able to load the entire project in context with Claude. For example, it may be better at answering high level questions, or writing features that are described in more abstract terms. But regardless, I think that Claude is hot on the heels of GPT-4, and if the reporting on it being a 52B model is true then it is already significantly smarter (pound for pound).
Just my 2c anyway
I agree that the Claude models sound like they are the most likely to be capable of working well with aider. I have been waiting for an api key for months unfortunately. My impression is that it is very difficult to get a key, which limits the benefits of integrating Claude into aider. Not many folks could use it.
Paul, perhaps try openrouter, which seems to sidestep the key issue and gives access to Claude directly.
Paul, perhaps try openrouter, which seems to sidestep the key issue and gives access to Claude directly.
Yes, I am aware of openrouter. But that is a confusing extra layer to explain to users. Most users won't have direct Claude api access. And I won't be able to test aider directly against the Claude api. It's all sort of workable, but far from ideal.
I like the idea of an easily extendable system, e.g. a flag (--bot llama5) that exports a class with this structure:
export default class LLaMA5 {
  // Declares the credentials this bot needs, so the host app can prompt for them.
  requirements = [
    {
      id: "apiKey",
      name: "API Key",
      type: "string",
      required: true,
    },
  ];

  constructor({ apiKey }) {
    this.apiKey = apiKey;
  }

  // Returns a ConvoParams object describing a new conversation.
  createConversation() {}

  // `progress` and `done` are callbacks for streaming and completion.
  sendMessage(message, { conversation, progress, done }) {}

  // Other optional methods such as deleteConversation, deleteMessage,
  // editMessage, retryMessage, etc.
}
This would be easily inspectable by Aider to check if this bot supports retrying, editing, etc as well as supporting the required parameters.
https://xkcd.com/927/
When it comes to the coding capabilities of local LLMs, I believe that HumanEval (pass@1) is the most important metric.
The leaderboard lists Starcoder-16b as the best open model with a score of 0.336, compared to GPT-4's score of 0.67.
We also have two GPT-3.5 models at 0.48/0.46. But here's the thing: there's also WizardLM-70B-V1.0, which no one has added to the leaderboards yet but which actually scores higher than GPT-3.5, at 0.506.
I don't have a machine powerful enough to run it, but I think that with minor tweaking it should perform as well as GPT-3.5.
All this being said, I'm not a dev, I haven't tested any of this, and honestly I don't fully understand all the steps that autonomous agents like aider take to get it to work.
Just thought I'd mention it in case it's useful to someone.
And Paul, great work with everything here. Really cool
Edit: There's also WizardCoder-15B-V1.0, with a score of 0.573, which was what I originally came to mention but somehow forgot along the way while checking sources.
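For context on those numbers, pass@k scores like the ones quoted above are usually computed with the unbiased estimator introduced alongside HumanEval. A minimal sketch (my own helper, not code from any leaderboard):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator from the HumanEval/Codex paper.

    n = total samples generated per problem,
    c = samples that passed the tests,
    k = evaluation budget (k=1 for the pass@1 scores above).
    """
    if n - c < k:
        # Fewer than k failing samples exist, so every k-sample draw
        # must contain at least one passing sample.
        return 1.0
    # 1 minus the probability that all k drawn samples are failures.
    return 1.0 - comb(n - c, k) / comb(n, k)

print(pass_at_k(10, 5, 1))  # → 0.5: half the samples pass, so pass@1 is 0.5
```

For k=1 this reduces to the simple fraction c/n, which is why pass@1 can be read as "chance a single completion passes the tests".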
I agree that the Claude models sound like they are the most likely to be capable of working well with aider. I have been waiting for an api key for months unfortunately. My impression is that it is very difficult to get a key, which limits the benefits of integrating Claude into aider. Not many folks could use it.
Hey Paul,
I'd be happy to lend you my API key to use for testing. There's a max of 1 call at a time, so if you can deal with that limitation - all good!
I'd be happy to lend you my API key to use for testing.
Thanks @JamesSKR. I have a loaner API key already. But again, so few people have Claude API access that it's not going to be very impactful to get aider working with Claude. Almost no one could use it. I definitely want to experiment with Claude, but it's not super high priority right now for that reason.
Adding support for the recently released Code llama (perhaps using cria?) would be very interesting imo, what do you think @paul-gauthier?
Hi Paul, thank you for such a great project. I love what you've done so far. I was also wondering if you've tested the PaLM API from Google, just wondering if it's any good?
@samuelmukoti I tested the PaLM models a little while working on the openrouter integration. They were OK but, similar to Llama, needed a bit of coaxing to output responses in a format aider would understand.
Just tested out the main branch with text-generation-webui's OpenAI API endpoint and it worked right away.
Here it is with Llamacoder:
Wow, that's exciting! Hope you can conduct further tests and share how the performance compares to GPT-3.5.
Thanks for sharing.
I wonder if there's a way to get ctags working 🤔
@sammcj Have you tried asking it to edit code?
@sammcj, how did you set up the API for the Llamacoder? I am interested in giving this a try. Thanks
ollama is by far the easiest way!!
@paul-gauthier Did you have a sample file and prompt you'd like to provide, so I can compare it to something you'd run against GPT-3.5? I can try it out and share the results.
@jdvaugha I have a heap of different LLM tools on my home server, but the one I seem to use the most is https://github.com/oobabooga/text-generation-webui, however as mentioned Ollama is a very easy way to get started.
@sammcj Try the tests in the Examples folder. Here is one of them https://github.com/paul-gauthier/aider/blob/main/examples/hello-world-flask.md
FYI, you can use llama.cpp to run local models behind an OpenAI-compatible server with little or no code modification:
https://github.com/ggerganov/llama.cpp/discussions/795
I've yet to try it, but I'm excited to test this with Code Llama.
Just set the OPENAI_API_BASE environment variable.
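For example, something like the following, where the URL is an assumption based on llama.cpp's example server listening on localhost (adjust the host and port for your setup):

```python
import os

# Point the OpenAI client library (and therefore aider) at a local
# OpenAI-compatible server instead of api.openai.com. The URL below
# is an assumed llama.cpp example-server address, not a tested default.
os.environ["OPENAI_API_BASE"] = "http://127.0.0.1:8080/v1"
os.environ["OPENAI_API_KEY"] = "dummy"  # many local servers ignore the key
```

Exporting the same two variables in your shell before launching aider should have the same effect.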