SWE-agent
Support for Deepseekcoder/Deepseek.com
Describe the feature
Currently, deepseek-coder is a promising open-source model to experiment with SWE-agent. It can be integrated via an OpenAI-compatible API, and deepseek.com offers 10M free tokens, making it easier to get started with SWE-agent...
```python
from openai import OpenAI

client = OpenAI(api_key="<deepseek api key>", base_url="https://api.deepseek.com/v1")

# get the list of available models
for model in client.models.list().data:
    print(model)

# retrieve info for a specific model
print(client.models.retrieve("deepseek-coder"))
```
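To illustrate the integration point, here is a hedged sketch of sending a chat completion to deepseek-coder through the same OpenAI-compatible endpoint. The prompt text is made up for illustration, and the request is only sent if a `DEEPSEEK_API_KEY` environment variable is set (that variable name is an assumption, not part of SWE-agent's config):

```python
import os

# Example messages for a chat completion request (illustrative only).
messages = [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Write a Python one-liner that reverses a string."},
]

# Only talk to the API when a key is available.
if os.environ.get("DEEPSEEK_API_KEY"):
    from openai import OpenAI

    client = OpenAI(
        api_key=os.environ["DEEPSEEK_API_KEY"],
        base_url="https://api.deepseek.com/v1",
    )
    response = client.chat.completions.create(model="deepseek-coder", messages=messages)
    print(response.choices[0].message.content)
```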
Potential Solutions
Should this be implemented in sweagent/agent/models.py, or is a bit of refactoring in order to accommodate upcoming APIs while maintaining cost tracking?
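As a sketch of the cost-tracking side of that question, per-model accounting could generalize to new APIs with something like the following. The class name and the per-token prices are placeholders, not SWE-agent's actual implementation or real DeepSeek pricing:

```python
from dataclasses import dataclass


@dataclass
class ModelStats:
    """Tracks accumulated API cost for one model (illustrative sketch)."""

    cost_per_1k_input: float   # placeholder price per 1k input tokens
    cost_per_1k_output: float  # placeholder price per 1k output tokens
    total_cost: float = 0.0

    def record(self, input_tokens: int, output_tokens: int) -> float:
        # Compute the cost of a single API call and add it to the running total.
        cost = (
            input_tokens / 1000 * self.cost_per_1k_input
            + output_tokens / 1000 * self.cost_per_1k_output
        )
        self.total_cost += cost
        return cost


stats = ModelStats(cost_per_1k_input=0.001, cost_per_1k_output=0.002)
stats.record(1500, 500)
print(round(stats.total_cost, 4))  # 1500/1000*0.001 + 500/1000*0.002 = 0.0025
```

New providers would then only need to register their prices, keeping the accounting logic shared.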
If there is a way to add this without too much overhead, feel free to open a PR. However, note that currently only the strongest models deliver good enough results, so there's no point in adding support for "smaller" models or models that aren't (almost) on par with GPT-4.
@klieret I agree there's no point in adding support for tiny 7B models etc. However, DeepSeek V2 (236B parameters) shows really good results. Aider, another open-source tool similar (well, more or less) to SWE-agent, recently posted a benchmark showing that DeepSeek V2 is almost as good as GPT-4, which is close to my personal experience. It's definitely better than Llama 3 70B: it supports a huge context window of up to 128k tokens, and it simply has over 3x more parameters, which isn't always an indicator of quality, but often is.

If it matters, I believe it's better to add support for their V2 model (the largest one, with 236B parameters) rather than the "coder" one. When it comes to analyzing existing code, V2 shows better results than the "coder" model, according to Aider.
We're going to add something like LiteLLM integration soon, so this issue will no longer be needed. Closing.
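For reference, a LiteLLM-based integration would route DeepSeek calls through a single interface. This is a hedged sketch, not the planned SWE-agent implementation: `deepseek/deepseek-coder` follows LiteLLM's `provider/model` naming, and the `DEEPSEEK_API_KEY` guard is an assumption for illustration:

```python
import os

# Illustrative request payload.
messages = [{"role": "user", "content": "Say hello."}]

# Only call out to the provider when credentials are available.
if os.environ.get("DEEPSEEK_API_KEY"):
    from litellm import completion

    response = completion(model="deepseek/deepseek-coder", messages=messages)
    print(response.choices[0].message.content)
```

The same `completion()` call would work unchanged for other providers, which is what would make per-provider code in models.py unnecessary.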