Together model/mixtral: validation error for CompletionRequest
Describe the bug
I encountered an error when running the Docker container for amd64. The main part of the error is as follows: "WARNING ❌ Failed on pvlib__pvlib-python-i1603: 1 validation error for CompletionRequest stop Input should be a valid list [type=list_type, input_value='<human>', input_type=str] For further information visit https://errors.pydantic.dev/2.6/v/list_type". After this, the agent attempts to start over but fails again.
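For context, the pydantic failure itself can be reproduced in isolation with a toy model; this is only an illustration of the list_type error, not Together's actual CompletionRequest class:

```python
# Toy reproduction: in pydantic v2, a field typed List[str] rejects a bare
# string with a `list_type` error ("Input should be a valid list").
from typing import List
from pydantic import BaseModel, ValidationError

class CompletionRequestSketch(BaseModel):
    stop: List[str]

try:
    CompletionRequestSketch(stop="<human>")
except ValidationError as err:
    print(err)
```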
Steps/Code to Reproduce
docker run --rm -it \
-v /var/run/docker.sock:/var/run/docker.sock \
-v /root/keys.cfg:/app/keys.cfg \
--platform=linux/amd64 \
klieret/swe-agent-run:latest \
python run.py --image_name=klieret/swe-agent:latest \
--model_name mixtral8x7b \
--data_path https://github.com/pvlib/pvlib-python/issues/1603 \
--config_file config/default_from_url.yaml --skip_existing=False
Expected Results
No error is thrown and the agent runs the task as expected.
Actual Results
INFO 💽 Loaded dataset from https://github.com/pvlib/pvlib-python/issues/1603
DEBUG Starting container with command: docker run -i --rm --name klieret-swe-agent-latest-5101df22cc klieret/swe-agent:latest /bin/bash -l -m
INFO 🌱 Environment Initialized
INFO ▶️ Beginning task 0
INFO Trying to clone from non-mirror...
WARNING install_environment is set to True, but the data path is a GitHub URL. Skipping conda environment installation.
INFO Initializing agent settings for container 7bb69df00f4ccf653e8699a8366d1121ab65906e481bf55142c79e6260fd6bc7
2024-04-08 06:46:23,320 - api_models - INFO - Resetting model stats
INFO SYSTEM (primary)
SETTING: You are an autonomous programmer, and you're working directly in the command line with a special interface.
The special interface consists of a file editor that shows you 100 lines of a file at a time.
In addition to typical bash commands, you can also use the following commands to help you navigate and edit files.
COMMANDS:
open:
docstring: opens the file at the given path in the editor. If line_number is provided, the window will be moved to include that line
signature: open <path> [<line_number>]
arguments:
- path (string) [required]: the path to the file to open
- line_number (integer) [optional]: the line number to move the window to (if not provided, the window will start at the top of the file)
goto:
docstring: moves the window to show <line_number>
signature: goto <line_number>
arguments:
- line_number (integer) [required]: the line number to move the window to
scroll_down:
docstring: moves the window down {WINDOW} lines
signature: scroll_down
scroll_up:
docstring: moves the window up {WINDOW} lines
signature: scroll_up
create:
docstring: creates and opens a new file with the given name
signature: create <filename>
arguments:
- filename (string) [required]: the name of the file to create
submit:
docstring: submits your current code and terminates the session
signature: submit
search_dir:
docstring: searches for search_term in all files in dir. If dir is not provided, searches in the current directory
signature: search_dir <search_term> [<dir>]
arguments:
- search_term (string) [required]: the term to search for
- dir (string) [optional]: the directory to search in (if not provided, searches in the current directory)
search_file:
docstring: searches for search_term in file. If file is not provided, searches in the current open file
signature: search_file <search_term> [<file>]
arguments:
- search_term (string) [required]: the term to search for
- file (string) [optional]: the file to search in (if not provided, searches in the current open file)
find_file:
docstring: finds all files with the given name in dir. If dir is not provided, searches in the current directory
signature: find_file <file_name> [<dir>]
arguments:
- file_name (string) [required]: the name of the file to search for
- dir (string) [optional]: the directory to search in (if not provided, searches in the current directory)
edit:
docstring: replaces lines <start_line> through <end_line> (inclusive) with the given text in the open file. The replacement text is terminated by a line with
only end_of_edit on it. All of the <replacement text> will be entered, so make sure your indentation is formatted properly. Python files will be checked for
syntax errors after the edit. If the system detects a syntax error, the edit will not be executed. Simply try to edit the file again, but make sure to read the
error message and modify the edit command you issue accordingly. Issuing the same command a second time will just lead to the same error message again.
signature: edit <start_line>:<end_line>
<replacement_text>
end_of_edit
arguments:
- start_line (integer) [required]: the line number to start the edit at
- end_line (integer) [required]: the line number to end the edit at (inclusive)
- replacement_text (string) [required]: the text to replace the current selection with
Please note that THE EDIT COMMAND REQUIRES PROPER INDENTATION.
If you'd like to add the line ' print(x)' you must fully write that out, with all those spaces before the code! Indentation is important and code that is
not indented correctly will fail and require fixing before it can be run.
RESPONSE FORMAT:
Your shell prompt is formatted as follows:
(Open file: <path>) <cwd> $
You need to format your output using two fields; discussion and command.
Your output should always include _one_ discussion and _one_ command field EXACTLY as in the following example:
DISCUSSION
First I'll start by using ls to see what files are in the current directory. Then maybe we can look at some relevant files to see what they look like.
```
ls -a
```
You should only include a *SINGLE* command in the command section and then wait for a response from the shell before continuing with more discussion and commands. Everything you include in the DISCUSSION section will be saved for future reference.
If you'd like to issue two commands at once, PLEASE DO NOT DO THAT! Please instead first submit just the first command, and then after receiving a response you'll be able to issue the second command.
You're free to use any other bash commands you want (e.g. find, grep, cat, ls, cd) in addition to the special commands listed above.
However, the environment does NOT support interactive session commands (e.g. python, vim), so please do not invoke them.
INFO DEMONSTRATION:
trajectories/demonstrations/replay__marshmallow-code__marshmallow-1867__default__t-0.20__p-0.95__c-2.00__install-1___install_from_source/marshmallow-code__marshmallow-1867.traj
INFO 🤖 MODEL INPUT
We're currently solving the following issue within our repository. Here's the issue text:
ISSUE:
golden-section search fails when upper and lower bounds are equal
**Describe the bug**
I have been using pvlib for some time now, and until now I was always passing a big dataframe containing readings for a long period. Because of some changes in our
software architecture, I need to pass the weather readings as a single reading (a dataframe with only one row), and I noticed that for readings where GHI-DHI are
zero, pvlib fails to calculate the output and returns the error below, while the same code executes correctly with weather information that has non-zero GHI-DHI:
```python
import os
import pathlib
import time
import json
from datetime import datetime
from time import mktime, gmtime
import pandas as pd
from pvlib import pvsystem
from pvlib import location as pvlocation
from pvlib import modelchain
from pvlib.temperature import TEMPERATURE_MODEL_PARAMETERS as PARAMS # not used -- to remove
from pvlib.bifacial.pvfactors import pvfactors_timeseries
from pvlib.temperature import TEMPERATURE_MODEL_PARAMETERS
class PV:
def pv_transform_time(self, val):
# tt = gmtime(val / 1000)
tt = gmtime(val)
dd = datetime.fromtimestamp(mktime(tt))
timestamp = pd.Timestamp(dd)
return timestamp
def __init__(self, model: str, inverter: str, latitude: float, longitude: float, **kwargs):
# super().__init__(**kwargs)
temperature_model_parameters = TEMPERATURE_MODEL_PARAMETERS["sapm"][
"open_rack_glass_glass"
]
# Load the database of CEC module model parameters
modules = pvsystem.retrieve_sam("cecmod")
# Load the database of CEC inverter model parameters
inverters = pvsystem.retrieve_sam("cecinverter")
# A bare bone PV simulator
# Load the database of CEC module model parameters
modules = pvsystem.retrieve_sam('cecmod')
inverters = pvsystem.retrieve_sam('cecinverter')
module_parameters = modules[model]
inverter_parameters = inverters[inverter]
location = pvlocation.Location(latitude=latitude, longitude=longitude)
system = pvsystem.PVSystem(module_parameters=module_parameters, inverter_parameters=inverter_parameters,
temperature_model_parameters=temperature_model_parameters)
self.modelchain = modelchain.ModelChain(system, location, aoi_model='no_loss', spectral_model="no_loss")
def process(self, data):
weather = pd.read_json(data)
# print(f"raw_weather: {weather}")
weather.drop('time.1', axis=1, inplace=True)
weather['time'] = pd.to_datetime(weather['time']).map(datetime.timestamp) # --> this works for the new process_weather code and also the old weather file
weather["time"] = weather["time"].apply(self.pv_transform_time)
weather.index = weather["time"]
# print(f"weather: {weather}")
# print(weather.dtypes)
# print(weather['ghi'][0])
# print(type(weather['ghi'][0]))
# simulate
self.modelchain.run_model(weather)
# print(self.modelchain.results.ac.to_frame().to_json())
print(self.modelchain.results.ac)
# good data
good_data = "{\"time\":{\"12\":\"2010-01-01
13:30:00+00:00\"},\"ghi\":{\"12\":36},\"dhi\":{\"12\":36},\"dni\":{\"12\":0},\"Tamb\":{\"12\":8.0},\"WindVel\":{\"12\":5.0},\"WindDir\":{\"12\":270},\"time.1\":{\ "12\":\"2010-01-01 13:30:00+00:00\"}}"
# data that causes error
data = "{\"time\":{\"4\":\"2010-01-01
05:30:00+00:00\"},\"ghi\":{\"4\":0},\"dhi\":{\"4\":0},\"dni\":{\"4\":0},\"Tamb\":{\"4\":8.0},\"WindVel\":{\"4\":4.0},\"WindDir\":{\"4\":240},\"time.1\":{\"4\":\"2 010-01-01 05:30:00+00:00\"}}"
p1 = PV(model="Trina_Solar_TSM_300DEG5C_07_II_", inverter="ABB__MICRO_0_25_I_OUTD_US_208__208V_", latitude=51.204483, longitude=5.265472)
p1.process(good_data)
print("=====")
p1.process(data)
```
Error:
```log
$ python3 ./tmp-pv.py
time
2010-01-01 13:30:00 7.825527
dtype: float64
=====
/home/user/.local/lib/python3.10/site-packages/pvlib/tools.py:340: RuntimeWarning: divide by zero encountered in divide
np.trunc(np.log(atol / (df['VH'] - df['VL'])) / np.log(phim1)))
Traceback (most recent call last):
File "/home/user/workspace/enorch/simulator/simulator_processor/src/pv/./tmp-pv.py", line 88, in <module>
p1.process(data)
File "/home/user/workspace/enorch/simulator/simulator_processor/src/pv/./tmp-pv.py", line 75, in process
self.modelchain.run_model(weather)
File "/home/user/.local/lib/python3.10/site-packages/pvlib/modelchain.py", line 1770, in run_model
self._run_from_effective_irrad(weather)
File "/home/user/.local/lib/python3.10/site-packages/pvlib/modelchain.py", line 1858, in _run_from_effective_irrad
self.dc_model()
File "/home/user/.local/lib/python3.10/site-packages/pvlib/modelchain.py", line 790, in cec
return self._singlediode(self.system.calcparams_cec)
File "/home/user/.local/lib/python3.10/site-packages/pvlib/modelchain.py", line 772, in _singlediode
self.results.dc = tuple(itertools.starmap(
File "/home/user/.local/lib/python3.10/site-packages/pvlib/pvsystem.py", line 931, in singlediode
return singlediode(photocurrent, saturation_current,
File "/home/user/.local/lib/python3.10/site-packages/pvlib/pvsystem.py", line 2826, in singlediode
out = _singlediode._lambertw(
File "/home/user/.local/lib/python3.10/site-packages/pvlib/singlediode.py", line 651, in _lambertw
p_mp, v_mp = _golden_sect_DataFrame(params, 0., v_oc * 1.14,
File "/home/user/.local/lib/python3.10/site-packages/pvlib/tools.py", line 364, in _golden_sect_DataFrame
raise Exception("Iterations exceeded maximum. Check that func",
Exception: ('Iterations exceeded maximum. Check that func', ' is not NaN in (lower, upper)')
```
I have to mention that, for now, the workaround I am using is to pass the weather data as a dataframe with two rows: the first row is good weather data that
pvlib can process, and the second row is the incoming weather reading (I can also post that code if you want).
**Expected behavior**
PVlib should behave consistently regardless of the GHI-DHI readings.
**Versions:**
```python
>>> import pvlib
>>> import pandas
>>> pvlib.__version__
'0.9.1'
>>> pandas.__version__
'1.4.3'
```
- python: 3.10.6
- OS: Ubuntu 22.04.1 LTS
INSTRUCTIONS:
Now, you're going to solve this issue on your own. Your terminal session has started and you're in the repository's root directory. You can use any bash commands
or the special interface to help you. Edit all the files you need to and run any checks or tests that you want.
Remember, YOU CAN ONLY ENTER ONE COMMAND AT A TIME. You should always wait for feedback after every command.
When you're satisfied with all of the changes you've made, you can submit your changes to the code base by simply running the submit command.
Note however that you cannot use any interactive session commands (e.g. python, vim) in this environment, but you can write scripts and run them. E.g. you can
write a python script and then run it with `python <script_name>.py`.
NOTE ABOUT THE EDIT COMMAND: Indentation really matters! When editing a file, make sure to insert appropriate indentation before each line!
IMPORTANT TIPS:
1. Always start by trying to replicate the bug that the issues discusses.
If the issue includes code for reproducing the bug, we recommend that you re-implement that in your environment, and run it to make sure you can reproduce the
bug.
Then start trying to fix it.
When you think you've fixed the bug, re-run the bug reproduction script to make sure that the bug has indeed been fixed.
If the bug reproduction script does not print anything when it successfully runs, we recommend adding a print("Script completed successfully, no errors.")
command at the end of the file,
so that you can be sure that the script indeed ran fine all the way through.
2. If you run a command and it doesn't work, try running a different command. A command that did not work once will not work the second time unless you modify it!
3. If you open a file and need to get to an area around a specific line that is not in the first 100 lines, say line 583, don't just use the scroll_down command
multiple times. Instead, use the goto 583 command. It's much quicker.
4. If the bug reproduction script requires inputting/reading a specific file, such as buggy-input.png, and you'd like to understand how to input that file,
conduct a search in the existing repo code, to see whether someone else has already done that. Do this by running the command: find_file "buggy-input.png". If that doesn't work, use the linux 'find' command.
5. Always make sure to look at the currently open file and the current working directory (which appears right after the currently open file). The currently open
file might be in a different directory than the working directory! Note that some commands, such as 'create', open files, so they might change the current open
file.
6. When editing files, it is easy to accidentally specify a wrong line number or to write code with incorrect indentation. Always check the code after you issue
an edit to make sure that it reflects what you wanted to accomplish. If it didn't, issue another command to fix it.
7. It may be necessary to install the repository from source before you can run code. Please think about how to install the environment from the repository
directory if you need to do so.
(Open file: n/a)
(Current directory: /pvlib__pvlib-python)
bash-$
/usr/local/lib/python3.9/site-packages/together/legacy/complete.py:23: UserWarning: The use of together.api_key is deprecated and will be removed in the next major release. Please set the TOGETHER_API_KEY environment variable instead.
warnings.warn(API_KEY_WARNING)
Traceback (most recent call last):
File "/app/run.py", line 136, in main
info, trajectory = agent.run(
File "/app/sweagent/agent/agents.py", line 645, in run
thought, action, output = self.forward(
File "/app/sweagent/agent/agents.py", line 373, in forward
thought, action, output = self.forward_with_error_check(observation, state)
File "/app/sweagent/agent/agents.py", line 513, in forward_with_error_check
output = self.forward_model(observation, state)
File "/app/sweagent/agent/agents.py", line 427, in forward_model
return self.model.query(self.local_history)
File "/usr/local/lib/python3.9/site-packages/tenacity/__init__.py", line 289, in wrapped_f
return self(f, *args, **kw)
File "/usr/local/lib/python3.9/site-packages/tenacity/__init__.py", line 379, in __call__
do = self.iter(retry_state=retry_state)
File "/usr/local/lib/python3.9/site-packages/tenacity/__init__.py", line 325, in iter
raise retry_exc.reraise()
File "/usr/local/lib/python3.9/site-packages/tenacity/__init__.py", line 158, in reraise
raise self.last_attempt.result()
File "/usr/local/lib/python3.9/concurrent/futures/_base.py", line 439, in result
return self.__get_result()
File "/usr/local/lib/python3.9/concurrent/futures/_base.py", line 391, in __get_result
raise self._exception
File "/usr/local/lib/python3.9/site-packages/tenacity/__init__.py", line 382, in __call__
result = fn(*args, **kwargs)
File "/app/sweagent/agent/models.py", line 563, in query
completion = together.Complete.create(
File "/usr/local/lib/python3.9/site-packages/together/legacy/base.py", line 25, in new_func
return func(*args, **kwargs)
File "/usr/local/lib/python3.9/site-packages/together/legacy/complete.py", line 28, in create
return client.completions.create(
File "/usr/local/lib/python3.9/site-packages/together/resources/completions.py", line 79, in create
parameter_payload = CompletionRequest(
File "/usr/local/lib/python3.9/site-packages/pydantic/main.py", line 171, in __init__
self.__pydantic_validator__.validate_python(data, self_instance=self)
pydantic_core._pydantic_core.ValidationError: 1 validation error for CompletionRequest
stop
Input should be a valid list [type=list_type, input_value='<human>', input_type=str]
For further information visit https://errors.pydantic.dev/2.6/v/list_type
WARNING ❌ Failed on pvlib__pvlib-python-i1603: 1 validation error for CompletionRequest
stop
Input should be a valid list [type=list_type, input_value='<human>', input_type=str]
For further information visit https://errors.pydantic.dev/2.6/v/list_type
INFO Beginning environment shutdown...
INFO Agent container stopped
System Information
OS: Debian GNU/Linux 12 (bookworm) x86_64
Host: KVM/QEMU (Standard PC (i440FX + PIIX, 1996) pc-i440fx-7.2)
Kernel: 6.1.0-18-amd64
Uptime: 1 day, 18 hours, 10 mins
Packages: 489 (dpkg), 6 (snap)
Shell: bash 5.2.15
Resolution: 1280x800
Terminal: /dev/pts/0
CPU: AMD Ryzen 9 7950X3D (6) @ 4.199GHz
GPU: 00:02.0 Vendor 1234 Device 1111
Memory: 773MiB / 11960MiB
Python 3.11.2 (main, Mar 13 2023, 12:18:29) [GCC 12.2.0] on linux
Just realized it's a duplicate of #124
I don't think it's a duplicate of #124
So it seems to have failed in querying the model (model.query), in particular these few lines
prompt = self.history_to_messages(history)
completion = together.Complete.create(
    model=self.api_model,
    prompt=prompt,
    max_tokens=self.model_metadata["max_context"],
    stop="<human>",
    temperature=self.args.temperature,
    top_p=self.args.top_p,
)
what version of the together package are you running?
I'm now working on #185.
This is because ChatCompletionRequest.stop requires List[str], but a plain str is passed.
https://github.com/togethercomputer/together-python/blob/main/src/together/types/chat_completions.py#L83 https://github.com/princeton-nlp/SWE-agent/blob/main/sweagent/agent/models.py#L569
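For reference, a minimal sketch of a corrected call under the new SDK (not the repository's actual patch; the model id and values are placeholders, and TOGETHER_API_KEY must be set in the environment):

```python
# Sketch only: the new CompletionRequest validates `stop` as List[str],
# so the sentinel has to be wrapped in a list rather than passed as a str.
import together

completion = together.Complete.create(
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",  # placeholder model id
    prompt="Say hello.",                           # placeholder prompt
    max_tokens=256,
    stop=["<human>"],                              # list, not "<human>"
    temperature=0.2,
    top_p=0.95,
)
```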
But even if we fix this, another error will occur:
together.error.InvalidRequestError: Error code: 400 - {"message": "Input validation error: `inputs` tokens + `max_new_tokens` must be <= 4097. Given: 14 `inputs` tokens and 4096 `max_new_tokens`", "type_": "invalid_request_error", "param": "max_tokens", "code": null}
This will work if we modify the max_context of Model.
e.g. https://github.com/princeton-nlp/SWE-agent/blob/main/sweagent/agent/models.py#L496
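A rough sketch of the token budgeting this implies (names like MAX_CONTEXT and prompt_tokens are illustrative placeholders, not SWE-agent attributes):

```python
# Illustration: prompt tokens + max_new_tokens must stay within the model's
# context window, so either budget explicitly or let the API decide.
MAX_CONTEXT = 4097        # limit quoted in the error message above
prompt_tokens = 14        # would come from a tokenizer in a real run

max_tokens = MAX_CONTEXT - prompt_tokens   # leaves exactly the allowed headroom
# ...or, as suggested further down in this thread, simply:
max_tokens = None                          # API fills the remaining context
```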
Next, we will fail with a KeyError for the response. It seems like this is because output is missing from ChatCompletionResponse.
https://github.com/togethercomputer/together-python/blob/main/src/together/types/chat_completions.py#L113-L127 https://github.com/princeton-nlp/SWE-agent/blob/main/sweagent/agent/models.py#L574-L576
It looks like we can directly access choices and usage instead.
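For illustration, reading a v1.x response through choices and usage could look roughly like this (a sketch assuming the v1.x client API linked above, not verified SWE-agent code; model id and prompt are placeholders):

```python
# Sketch: v1.x responses expose `choices` and `usage`; there is no `output` key.
from together import Together

client = Together()  # reads TOGETHER_API_KEY from the environment
response = client.completions.create(
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",  # placeholder model id
    prompt="Say hello.",
    max_tokens=64,
    stop=["<human>"],
)
text = response.choices[0].text   # generated completion text
usage = response.usage            # prompt/completion token counts
```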
This series of errors is due to big changes that were introduced in the new major version released last week. If you want to use it right away, rolling back the Together version to v0.2.11 might work well.
> I'm now working on https://github.com/princeton-nlp/SWE-agent/issues/185. This is because ChatCompletionRequest.stop requires List[str], but a plain str is passed.
> https://github.com/togethercomputer/together-python/blob/main/src/together/types/chat_completions.py#L83 https://github.com/princeton-nlp/SWE-agent/blob/main/sweagent/agent/models.py#L569

Thank you. Yes, stop is supposed to be a List[str] on the Together API: https://docs.together.ai/reference/chat-completions

> But even if we fix this, another error will occur:
> together.error.InvalidRequestError: Error code: 400 - {"message": "Input validation error: `inputs` tokens + `max_new_tokens` must be <= 4097. Given: 14 `inputs` tokens and 4096 `max_new_tokens`", "type_": "invalid_request_error", "param": "max_tokens", "code": null}
> This will work if we modify the max_context of Model, e.g. https://github.com/princeton-nlp/SWE-agent/blob/main/sweagent/agent/models.py#L496

Input token count + max_tokens needs to be less than the context length of the model. Model context lengths can be found here.
You can also set max_tokens=None to avoid the error if you don't know how many input tokens you have.

> Next, we will fail with a KeyError for the response. It seems like this is because output is missing from ChatCompletionResponse.
> https://github.com/togethercomputer/together-python/blob/main/src/together/types/chat_completions.py#L113-L127 https://github.com/princeton-nlp/SWE-agent/blob/main/sweagent/agent/models.py#L574-L576
> It looks like we can directly access choices and usage instead.

> This series of errors is due to big changes that were introduced in the new major version released last week. If you want to use it right away, rolling back the Together version to v0.2.11 might work well.
Yes. v1.0.0 brought in breaking changes. Hardcoding v0.2.11 until these changes have been made to this repo would be my suggestion. v1.2.0 will have backwards compatibility to v0.2.11.
p.s. I'm a maintainer of the Together Python SDK.
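If you do pin v0.2.11 in the meantime, a small startup check can make an accidental upgrade obvious; this is purely an illustrative sketch, not something SWE-agent ships:

```python
# Illustrative guard: warn early if an incompatible together SDK is installed.
from importlib.metadata import version
from packaging.version import Version

installed = Version(version("together"))
if installed >= Version("1.0.0"):
    print(f"Warning: together {installed} introduced breaking API changes; "
          f"pin together==0.2.11 until SWE-agent supports the new SDK.")
```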
@orangetin Thank you very much for providing the information! It's really helpful!
In particular, your comment below will help me fix the issue quickly!
> You can also set max_tokens=None to avoid the error if you don't know how many input tokens you have.