SWE-agent
Feature for supplying installation instructions for arbitrary repos
Describe the feature
Right now we have two modes:
- solving a swe-bench issue
- solving an arbitrary GitHub issue.
There's a small difference between these modes:
When we solve a swe-bench issue, we use this file: https://github.com/princeton-nlp/SWE-bench/blob/main/swebench/harness/constants.py
to preinstall all the dependencies for the issue, so that when the agent starts it's trying to solve the issue right away.
For arbitrary GitHub issues, we don't do this, so the agent has to spend its first few commands on installation. That's totally OK, but it would be better if the user could formally state the dependencies and have them preinstalled in the Docker container, so that the agent is ready to go.
Potential Solutions
Let's write a feature where the user can specify an environment.yml and we preinstall all those packages. Or actually, to make this not Python-dependent, let's have them specify a bash script that runs right before the agent starts?
Then we have to remember to use the swe-bench demonstration with this feature rather than the demonstration we currently use for GitHub issues (the GitHub issue demonstration shows how to install dependencies, which we won't need).
(cc: @klieret who I've started discussing this with)
Another possible solution would be to have an install_script attribute in the config that takes a filename or raw string, which is executed instead of install_env in the environment setup. This way someone could also just specify a simple installation script such as:
pip install numpy pandas
pip install -e .
or even
conda env create -f environment.yml -n myenv
conda activate myenv
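To illustrate the "filename or raw string" idea, here is a minimal sketch of how such a value might be resolved into script text. The function name `resolve_install_script` is hypothetical and not part of SWE-agent; it assumes that a value naming an existing file is read from disk and that anything else is the script itself.

```python
from pathlib import Path


def resolve_install_script(value: str) -> str:
    """Return the shell script text for a hypothetical install_script option.

    Assumption: a value that names an existing file is read from disk;
    anything else is treated as the raw script.
    """
    # A multi-line value cannot be a file name, so skip the filesystem check.
    if "\n" not in value:
        try:
            candidate = Path(value).expanduser()
            if candidate.is_file():
                return candidate.read_text()
        except OSError:
            # E.g. the string is too long to be a valid path: treat it as raw text.
            pass
    return value
```

One consequence of this design is that a one-line raw command that happens to match a local file name would be read as a file, so an explicit prefix or a separate option might be safer in practice.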
Relatedly, it may be desirable to have a files attribute that is a map of local files to dirs/filenames that will be moved into the container upon startup. This could then also support installing from an environment file if it doesn't exist in the repo already - or - just moving some relevant data or files into the agent's environment for processing.
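As a rough sketch of the files idea, the mapping could first be turned into a list of (local path, container path) pairs before anything is copied. The name `plan_file_copies` and the trailing-slash convention are assumptions made for illustration; the actual copy into the container is left out.

```python
from __future__ import annotations

from pathlib import Path, PurePosixPath


def plan_file_copies(files: dict[str, str]) -> list[tuple[str, str]]:
    """Resolve a hypothetical files map into (local, container) path pairs.

    Assumption: a destination ending in "/" is a directory and keeps the
    local file name; any other destination is the full target file name.
    """
    plan = []
    for local, dest in files.items():
        src = Path(local)
        if not src.is_file():
            raise FileNotFoundError(f"local file not found: {local}")
        if dest.endswith("/"):
            target = PurePosixPath(dest) / src.name
        else:
            target = PurePosixPath(dest)
        plan.append((str(src), str(target)))
    return plan
```

Failing early on a missing local file keeps the error on the host side, before the container is started.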
Ah, I see, so we'd completely bypass install_env?
I was thinking of creating an --environment_config flag that takes a yaml file with the yaml equivalent of
{
"python": "3.6",
"packages": "pytest",
"pip_packages": "tox",
"install": "python setup.py develop",
}
but perhaps you're right, this would be easier.
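For comparison, a config like the one above could be expanded into shell commands roughly as follows. This is only a sketch: `spec_to_commands` and the default environment name are made up, and it assumes that "packages" and "pip_packages" are space-separated package names, which is simpler than what the swe-bench harness supports.

```python
def spec_to_commands(spec: dict, env_name: str = "agent-env") -> list[str]:
    """Expand a swe-bench-style spec into shell commands (illustrative only).

    Assumption: "packages" and "pip_packages" hold space-separated names.
    """
    create = f"conda create -n {env_name} -y"
    if spec.get("python"):
        create += f" python={spec['python']}"
    if spec.get("packages"):
        create += f" {spec['packages']}"
    commands = [create, f"conda activate {env_name}"]
    if spec.get("pip_packages"):
        commands.append(f"pip install {spec['pip_packages']}")
    if spec.get("install"):
        # The install command is passed through unchanged.
        commands.append(spec["install"])
    return commands


spec = {
    "python": "3.6",
    "packages": "pytest",
    "pip_packages": "tox",
    "install": "python setup.py develop",
}
for command in spec_to_commands(spec):
    print(command)
```

The resulting list is essentially the bash script a user would otherwise write by hand, which is one argument for accepting the script directly.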
How about adding an --environment_setup flag and then checking if it ends in .sh (then it's simply executed) or if it's a yaml file (then it's parsed like the swe-bench config).
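The extension check could look something like the sketch below. The function name and the returned labels are hypothetical; it only decides how the value would be handled and does not execute or parse anything.

```python
from pathlib import Path


def classify_environment_setup(path: str) -> str:
    """Decide how a hypothetical --environment_setup value would be handled.

    Returns "script" for .sh files (to be executed as-is) and "config" for
    yaml files (to be parsed like the swe-bench config).
    """
    suffix = Path(path).suffix.lower()
    if suffix == ".sh":
        return "script"
    if suffix in {".yaml", ".yml"}:
        return "config"
    raise ValueError(f"unsupported environment setup file: {path}")
```

Raising on unknown extensions avoids silently guessing how to treat a file.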
Relatedly, it may be desirable to have a files attribute that is a map of local files to dirs/filenames that will be moved into the container upon startup. This could then also support installing from an environment file if it doesn't exist in the repo already - or - just moving some relevant data or files into the agent's environment for processing.
Good idea!