
Ability to swap model during session

Open typpo opened this issue 10 months ago • 4 comments

Thanks so much for creating this tool.

I sometimes need to switch the model I'm using, usually when the context window gets too large. For example, someone might switch to the more expensive 16k or 32k versions of GPT-3.5 or GPT-4 after hitting a token limit.

A /model <modelname> command might be a good way to do this.

typpo avatar Aug 15 '23 16:08 typpo

Thanks for trying aider!

Which models do you want to switch between?

In some cases, it would be very hard to change the model mid-conversation. Changing from gpt-4 to gpt-3.5 or vice-versa would also require changing the "edit format" and rewriting all of the conversation history. Changing between different versions of 3.5 or between different versions of 4 might be possible.

But in practice, I'm not sure how useful this feature would be. gpt-3.5-16k is pretty inexpensive, so why not always use it? And for gpt-4, very few people have access to gpt-4-32k, so almost no one would be able to increase the context window size mid-conversation.

paul-gauthier avatar Aug 16 '23 20:08 paul-gauthier

Maybe I'm one of the lucky ones. I normally work with gpt-4 but sometimes want to switch to gpt-4-32k.

I had a look at the code the other day and came to the same conclusion: swapping between different versions of the same model doesn't seem too bad, but swapping between models is tricky.

typpo avatar Aug 16 '23 20:08 typpo

I have been mulling over a more versatile /set feature. It would let you change a subset of the params you can configure on startup and restart the relevant parts of aider, while keeping your current chat history and file context.

model, edit-format, map-tokens, pretty, no-pretty, stream, no-stream, show-diffs, auto-commits, verbose were the ones I had in mind.
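
A rough sketch of how a /set command like this might parse its argument. Everything here is hypothetical: `SETTABLE` and `parse_set_command` are illustrative names, not taken from aider's code, and the value types are assumptions based on how the corresponding startup options are usually given.

```python
# Hypothetical sketch: names and structure are illustrative, not aider's code.
# Maps each setting proposed above to an assumed value type.
SETTABLE = {
    "model": str,
    "edit-format": str,
    "map-tokens": int,
    "pretty": bool,
    "stream": bool,
    "show-diffs": bool,
    "auto-commits": bool,
    "verbose": bool,
}


def parse_set_command(line):
    """Parse '/set <name> [value]' into a (name, value) pair.

    A 'no-' prefix on a boolean setting (e.g. 'no-pretty') turns it off.
    """
    parts = line.split()
    if len(parts) < 2 or parts[0] != "/set":
        raise ValueError("usage: /set <name> [value]")
    name, args = parts[1], parts[2:]

    # 'no-pretty', 'no-stream', ...: only valid for boolean settings.
    if name.startswith("no-") and SETTABLE.get(name[3:]) is bool:
        if args:
            raise ValueError(f"{name} takes no value")
        return name[3:], False

    kind = SETTABLE.get(name)
    if kind is None:
        raise ValueError(f"unknown setting: {name}")
    if kind is bool:
        if args:
            raise ValueError(f"{name} takes no value")
        return name, True
    if len(args) != 1:
        raise ValueError(f"{name} needs exactly one value")
    return name, kind(args[0])  # int() raises ValueError on bad input
```

With this sketch, `/set model gpt-4-32k` yields `("model", "gpt-4-32k")` and `/set no-pretty` yields `("pretty", False)`. Deciding what aider should restart after each change is the harder part and is not covered here.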

joshuavial avatar Sep 06 '23 03:09 joshuavial

(copying this comment from discord)

It seems like a nice-to-have feature.

The real work is looking at each command-line switch and deciding which bin it falls into:

  1. Trivial to dynamically change at runtime. Something like --verbose or --map-tokens.

  2. Possible to dynamically change at runtime, with special extra work when it changes. Something like --model could be made dynamic, as long as you only switch between identical models with different context windows, or if you always discard the conversation history when you switch models (but is that really dynamic, then?). Other switches could have similar special limitations or requirements.

  3. Impossible to sensibly make dynamic, like --openai-api-base.
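
The three bins above could be sketched roughly as follows. This is only an illustration under stated assumptions: `Bin`, `SWITCH_BINS`, `model_family`, and `apply_setting` are hypothetical names, the "same family" check is a crude name-prefix heuristic, and discarding the history on a cross-family switch is just one of the options mentioned in item 2.

```python
from enum import Enum


class Bin(Enum):
    TRIVIAL = 1     # safe to change at runtime with no extra work
    SPECIAL = 2     # changeable, but needs extra handling
    IMPOSSIBLE = 3  # cannot sensibly change mid-session


# Only the switches named in the list above; the rest would need triage.
SWITCH_BINS = {
    "verbose": Bin.TRIVIAL,
    "map-tokens": Bin.TRIVIAL,
    "model": Bin.SPECIAL,
    "openai-api-base": Bin.IMPOSSIBLE,
}


def model_family(name):
    """Assumed heuristic: 'gpt-4-32k' -> 'gpt-4', 'gpt-3.5-turbo-16k' -> 'gpt-3.5'."""
    return "-".join(name.split("-")[:2])


def apply_setting(settings, history, name, value):
    """Return (new_settings, new_history) after applying one change."""
    bin_ = SWITCH_BINS.get(name)
    if bin_ is None or bin_ is Bin.IMPOSSIBLE:
        raise ValueError(f"{name} cannot be changed at runtime")
    new_settings = {**settings, name: value}
    if name == "model" and model_family(settings["model"]) != model_family(value):
        # Different model family: the old history is not reused, so drop it.
        return new_settings, []
    # Trivial switches, and same-family model changes, keep the history.
    return new_settings, list(history)
```

Even in this toy form, --model is the only switch that needs its own branch, which hints at the maintenance cost of each additional "special" switch.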

After we do that binning, is it still worth adding a /set command? Or are there too few useful switches that justify the work of supporting them? Or would we be building a big, complex machine to support the harder switches (like --model), one that slows down and complicates future innovative work?

paul-gauthier avatar Sep 08 '23 15:09 paul-gauthier

This looks like a duplicate of #402, so I'm going to close it so discussion can happen there. Please let me know if you think it's actually a distinct issue.

paul-gauthier avatar Apr 29 '24 19:04 paul-gauthier