Together.ai models are not selectable
Hello! I've been attempting to use together.ai models with this project, and I even looked through https://github.com/princeton-nlp/SWE-agent/blob/main/sweagent/agent/models.py, but I can't get the models to work. It just keeps giving me an "invalid model name" error:
```
Traceback (most recent call last):
  File "/root/SWE-agent/run.py", line 288, in <module>
    main(args)
  File "/root/SWE-agent/run.py", line 88, in main
    agent = Agent("primary", args.agent)
  File "/root/SWE-agent/sweagent/agent/agents.py", line 186, in __init__
    self.model = get_model(args.model, args.config._commands + args.config.subroutine_types)
  File "/root/SWE-agent/sweagent/agent/models.py", line 712, in get_model
    raise ValueError(f"Invalid model name: {args.model_name}")
ValueError: Invalid model name: mixtral8x7b
```
Indeed, it looks like this is missing from the get_model function despite being set up in the models file.
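For context, get_model is just a chain of name-prefix checks; simplified from what I see in models.py (branch order and names here are approximate, not the exact source):

```python
def get_model(args, commands):
    # Dispatch on the model-name prefix; there is no branch that routes
    # "mixtral" (or the other TogetherModel names) to TogetherModel.
    if args.model_name.startswith("gpt"):
        return OpenAIModel(args, commands)
    elif args.model_name.startswith("claude"):
        return AnthropicModel(args, commands)
    elif args.model_name.startswith("ollama"):
        return OllamaModel(args, commands)
    # ... other branches ...
    raise ValueError(f"Invalid model name: {args.model_name}")
```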
I did a temporary patch for myself by just adding

```python
elif args.model_name.startswith("mixtral"):
    return TogetherModel(args, commands)
```

in there, since I'm really only going to be using mixtral, and then ran into another issue:
```
INFO 💽 Loaded dataset from https://github.com/pvlib/pvlib-python/issues/1603
DEBUG Starting container with command: docker run -i --rm --name swe-agent-ba22acd1d9 swe-agent /bin/bash -l -m
ERROR Unexpected container setup output: Unable to find image 'swe-agent:latest' locally
docker: Error response from daemon: pull access denied for swe-agent, repository does not exist or may require 'docker login': denied: requested access to the resource is denied.
See 'docker run --help'.
Traceback (most recent call last):
  File "/root/miniconda3/envs/swe-agent/lib/python3.9/site-packages/docker/api/client.py", line 265, in _raise_for_status
    response.raise_for_status()
  File "/root/miniconda3/envs/swe-agent/lib/python3.9/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: http+docker://localhost/v1.43/containers/swe-agent-ba22acd1d9/json

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/aurologic/SWE-agent/run.py", line 288, in <module>
    main(args)
  File "/home/aurologic/SWE-agent/run.py", line 90, in main
    env = SWEEnv(args.environment)
  File "/home/aurologic/SWE-agent/sweagent/environment/swe_env.py", line 107, in __init__
    self._reset_container()
  File "/home/aurologic/SWE-agent/sweagent/environment/swe_env.py", line 355, in _reset_container
    self._init_container()
  File "/home/aurologic/SWE-agent/sweagent/environment/swe_env.py", line 388, in _init_container
    self.container_obj = client.containers.get(self.container_name)
  File "/root/miniconda3/envs/swe-agent/lib/python3.9/site-packages/docker/models/containers.py", line 951, in get
    resp = self.client.api.inspect_container(container_id)
  File "/root/miniconda3/envs/swe-agent/lib/python3.9/site-packages/docker/utils/decorators.py", line 19, in wrapped
    return f(self, resource_id, *args, **kwargs)
  File "/root/miniconda3/envs/swe-agent/lib/python3.9/site-packages/docker/api/container.py", line 792, in inspect_container
    return self._result(
  File "/root/miniconda3/envs/swe-agent/lib/python3.9/site-packages/docker/api/client.py", line 271, in _result
    self._raise_for_status(response)
  File "/root/miniconda3/envs/swe-agent/lib/python3.9/site-packages/docker/api/client.py", line 267, in _raise_for_status
    raise create_api_error_from_http_exception(e) from e
  File "/root/miniconda3/envs/swe-agent/lib/python3.9/site-packages/docker/errors.py", line 39, in create_api_error_from_http_exception
    raise cls(e, response=response, explanation=explanation) from e
docker.errors.NotFound: 404 Client Error for http+docker://localhost/v1.43/containers/swe-agent-ba22acd1d9/json: Not Found ("No such container: swe-agent-ba22acd1d9")
```
Actually, I saw that you did a new push; I updated to it and it seems to work now!
Yes, sorry, my bad: the default image name was incorrect. I'll take a look at how to get all the TogetherModels in there.
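For anyone who hit the same "Unable to find image" error before updating: you can check what the local daemon actually has via the same docker SDK that appears in the traceback (the corrected tag below is an assumption, not necessarily the real default):

```python
import docker
from docker import errors

client = docker.from_env()

# List the tags the local daemon actually has, to compare against the
# image name SWE-agent tries to run.
for image in client.images.list():
    print(image.tags)

# Or probe a single tag; ImageNotFound means it must be built or pulled.
try:
    client.images.get("sweagent/swe-agent:latest")  # assumed corrected tag
except errors.ImageNotFound:
    print("image not present locally")
```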
This should now be fixed in the development version. Could you check, @InfiniteCod3?
Yup! Seems to be working fine now.
I am having a similar issue while trying codellama through an OpenAI-compatible endpoint (litellm -> ollama). Would it be possible to support any model, so the list is not enforced in the code? That would make it easier to test new models.
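For reference, the endpoint itself works fine with a plain OpenAI-style client; it's only the name check in SWE-agent that rejects it. The URL and port below are my local litellm proxy defaults, adjust as needed:

```python
from openai import OpenAI

# The litellm proxy exposes an OpenAI-compatible API in front of ollama;
# any client that speaks the OpenAI protocol can talk to it.
client = OpenAI(base_url="http://localhost:4000/v1", api_key="dummy")

response = client.chat.completions.create(
    model="ollama/codellama",
    messages=[{"role": "user", "content": "Say hello"}],
)
print(response.choices[0].message.content)
```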
And a similar issue with ollama/codellama:
```
app-1 | Traceback (most recent call last):
app-1 |   File "/app/run.py", line 516, in <module>
app-1 |     main(get_args())
app-1 |   File "/app/run.py", line 512, in main
app-1 |     Main(args).main()
app-1 |   File "/app/run.py", line 312, in __init__
app-1 |     self.agent = Agent("primary", args.agent)
app-1 |   File "/app/sweagent/agent/agents.py", line 244, in __init__
app-1 |     self.model = get_model(args.model, args.config._commands + args.config.subroutine_types)
app-1 |   File "/app/sweagent/agent/models.py", line 887, in get_model
app-1 |     return OllamaModel(args, commands)
app-1 |   File "/app/sweagent/agent/models.py", line 577, in __init__
app-1 |     super().__init__(args, commands)
app-1 |   File "/app/sweagent/agent/models.py", line 117, in __init__
app-1 |     raise ValueError(msg)
app-1 | ValueError: Unregistered model (ollama/codellama). Add model name to MODELS metadata to <class 'sweagent.agent.models.OllamaModel'>
```
Llama models are supported; see https://princeton-nlp.github.io/SWE-agent/usage/usage_faq/.
If someone wants to start implementing litellm support, we'd be very happy to merge it, but we're currently working on a few other things first.
Generally, small models will not show good performance on SWE-agent.
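For whoever picks this up: a litellm-backed model class would mostly just wrap litellm.completion. A minimal sketch, where the model name and endpoint values are illustrative rather than anything SWE-agent currently defines:

```python
import litellm

# litellm normalizes many providers behind one OpenAI-style call; a
# SWE-agent model class could forward its message history to this.
response = litellm.completion(
    model="ollama/codellama",            # provider-prefixed name, illustrative
    api_base="http://localhost:11434",   # local ollama endpoint, an assumption
    messages=[{"role": "user", "content": "Say hello"}],
)
print(response.choices[0].message.content)
```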
The problem is not implementing litellm (it exposes an OpenAI-compatible endpoint); the problem is that SWE-agent is inflexible about the model name. It would help if any name were accepted.
For example, I wanted to experiment with codestral from mistral, which "should" have good performance, and getting it accepted is complicated. What would also be good is a command switch to provide or enforce the max context, since models degrade very fast as they approach their max context.
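To make that concrete, what I have in mind is roughly a token-budget switch that clips the history before each call; a minimal sketch, where every name (the flag, count_tokens, clip_history) is hypothetical and not an existing SWE-agent API:

```python
import argparse

parser = argparse.ArgumentParser()
# Hypothetical switch: cap the context handed to the model, regardless
# of what the provider advertises as the maximum.
parser.add_argument("--max-context", type=int, default=8192)
args = parser.parse_args()

def clip_history(messages, count_tokens, max_context, reserve=1024):
    """Drop the oldest non-system messages until the prompt fits the budget.

    count_tokens is a callable returning the token count of a message list,
    and reserve keeps room for the reply; both are assumptions.
    """
    budget = max_context - reserve
    clipped = list(messages)
    while len(clipped) > 1 and count_tokens(clipped) > budget:
        del clipped[1]  # keep the system message at index 0, drop the oldest turn
    return clipped
```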