SWE-agent
Support for Deepseekcoder/Deepseek.com
Describe the feature
Currently, deepseek-coder is a promising open-source model to experiment with SWE-agent. It can be integrated via an OpenAI-compatible API, and deepseek.com offers 10M free tokens, making it easier to get started with SWE-agent...
```python
from openai import OpenAI

client = OpenAI(api_key="<deepseek api key>", base_url="https://api.deepseek.com/v1")

# get the list of available models
for model in client.models.list().data:
    print(model)

# retrieve info for a specific model
print(client.models.retrieve("deepseek-coder"))
```
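To illustrate the integration point, here is a hedged sketch of sending a chat completion to deepseek-coder through the same OpenAI-compatible endpoint. The prompt text is made up for illustration, and the request is only sent if a `DEEPSEEK_API_KEY` environment variable is set (that variable name is an assumption, not part of SWE-agent's config):

```python
import os

# Example messages for a chat completion request (illustrative only).
messages = [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Write a Python one-liner that reverses a string."},
]

# Only talk to the API when a key is available.
if os.environ.get("DEEPSEEK_API_KEY"):
    from openai import OpenAI

    client = OpenAI(
        api_key=os.environ["DEEPSEEK_API_KEY"],
        base_url="https://api.deepseek.com/v1",
    )
    response = client.chat.completions.create(model="deepseek-coder", messages=messages)
    print(response.choices[0].message.content)
```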
Potential Solutions
Should this be implemented in sweagent/agent/models.py, or is a bit of refactoring in order to accommodate upcoming APIs while maintaining cost tracking?
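As a sketch of the cost-tracking side of that question, per-model accounting could generalize to new APIs with something like the following. The class name and the per-token prices are placeholders, not SWE-agent's actual implementation or real DeepSeek pricing:

```python
from dataclasses import dataclass


@dataclass
class ModelStats:
    """Tracks accumulated API cost for one model (illustrative sketch)."""

    cost_per_1k_input: float   # placeholder price per 1k input tokens
    cost_per_1k_output: float  # placeholder price per 1k output tokens
    total_cost: float = 0.0

    def record(self, input_tokens: int, output_tokens: int) -> float:
        # Compute the cost of a single API call and add it to the running total.
        cost = (
            input_tokens / 1000 * self.cost_per_1k_input
            + output_tokens / 1000 * self.cost_per_1k_output
        )
        self.total_cost += cost
        return cost


stats = ModelStats(cost_per_1k_input=0.001, cost_per_1k_output=0.002)
stats.record(1500, 500)
print(round(stats.total_cost, 4))  # 1500/1000*0.001 + 500/1000*0.002 = 0.0025
```

New providers would then only need to register their prices, keeping the accounting logic shared.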
If there is a way to add this without too much overhead, feel free to open a PR. However, note that currently only the strongest models deliver good enough results, so there's no point in adding support for "smaller" models or models that aren't (almost) on par with GPT-4.
@klieret I agree there's no point in adding support for tiny 7B models etc. However, DeepSeek V2 (236B parameters) shows really good results. Aider, another open-source tool similar (well, more or less) to SWE-agent, recently posted a benchmark showing that DeepSeek V2 is almost as good as GPT-4, which is close to my personal experience. It's definitely better than Llama 3 70B: it supports a huge context window of up to 128k tokens, and it simply has over 3x more parameters, which isn't always an indicator of quality, but often is.

If it matters, I believe it's better to add support for their V2 model (the largest one, with 236B parameters) rather than the "coder" one. When it comes to analyzing existing code, V2 shows better results than the "coder" model, according to Aider.
We're going to add something like LiteLLM integration soon, so this issue will no longer be needed. Closing.
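For reference, a LiteLLM-based integration would route DeepSeek calls through a single interface. This is a hedged sketch, not the planned SWE-agent implementation: `deepseek/deepseek-coder` follows LiteLLM's `provider/model` naming, and the `DEEPSEEK_API_KEY` guard is an assumption for illustration:

```python
import os

# Illustrative request payload.
messages = [{"role": "user", "content": "Say hello."}]

# Only call out to the provider when credentials are available.
if os.environ.get("DEEPSEEK_API_KEY"):
    from litellm import completion

    response = completion(model="deepseek/deepseek-coder", messages=messages)
    print(response.choices[0].message.content)
```

The same `completion()` call would work unchanged for other providers, which is what would make per-provider code in models.py unnecessary.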