semantra
openai.error.InvalidRequestError: '$.input' is invalid.
I tried running it on ~50 files from a grep result, of types:
- csv
- txt
- md
- html
- py
Pressing y for each file to be processed by openai was annoying, so I cancelled and tried again with --no-confirm and got this error. I then tried again without --no-confirm and still get the same error:
  File "C:\Users\endolith\anaconda3\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\endolith\anaconda3\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "c:\users\endolith\.local\bin\semantra.exe\__main__.py", line 7, in <module>
    try:
  File "C:\Users\endolith\.local\pipx\venvs\semantra\lib\site-packages\click\core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "C:\Users\endolith\.local\pipx\venvs\semantra\lib\site-packages\click\core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "C:\Users\endolith\.local\pipx\venvs\semantra\lib\site-packages\click\core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\Users\endolith\.local\pipx\venvs\semantra\lib\site-packages\click\core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "C:\Users\endolith\.local\pipx\venvs\semantra\lib\site-packages\semantra\semantra.py", line 619, in main
    documents[fn] = process(
  File "C:\Users\endolith\.local\pipx\venvs\semantra\lib\site-packages\semantra\semantra.py", line 307, in process
    flush_pool()
  File "C:\Users\endolith\.local\pipx\venvs\semantra\lib\site-packages\semantra\semantra.py", line 272, in flush_pool
    embedding_results = model.embed(tokens, pool)
  File "C:\Users\endolith\.local\pipx\venvs\semantra\lib\site-packages\semantra\models.py", line 144, in embed
    response = openai.Embedding.create(model=self.model_name, input=texts)
  File "C:\Users\endolith\.local\pipx\venvs\semantra\lib\site-packages\openai\api_resources\embedding.py", line 33, in create
    response = super().create(*args, **kwargs)
  File "C:\Users\endolith\.local\pipx\venvs\semantra\lib\site-packages\openai\api_resources\abstract\engine_api_resource.py", line 153, in create
    response, _, api_key = requestor.request(
  File "C:\Users\endolith\.local\pipx\venvs\semantra\lib\site-packages\openai\api_requestor.py", line 298, in request
    resp, got_stream = self._interpret_response(result, stream)
  File "C:\Users\endolith\.local\pipx\venvs\semantra\lib\site-packages\openai\api_requestor.py", line 700, in _interpret_response
    self._interpret_response_line(
  File "C:\Users\endolith\.local\pipx\venvs\semantra\lib\site-packages\openai\api_requestor.py", line 763, in _interpret_response_line
    raise self.handle_error_response(
openai.error.InvalidRequestError: '$.input' is invalid. Please check the API reference: https://platform.openai.com/docs/api-reference.
Ah, it's a specific paper that's causing the error.
https://www.math.union.edu/~dpvc/papers/2001-01.DC-BG-BZ/DC-BG-BZ.pdf
semantra --port 1234 --model openai "DC-BG-BZ.pdf"
Hmm, strange, it's working for me. What version of Semantra are you running (semantra --version)? I'm on 0.1.7 (you can update with pipx upgrade semantra).
λ semantra --version
0.1.7
λ pipx runpip semantra list
Package Version
------------------ ------------
aiohttp 3.8.4
aiosignal 1.3.1
annoy-fixed 1.16.3
async-timeout 4.0.2
attrs 23.1.0
blinker 1.6.2
certifi 2023.5.7
charset-normalizer 3.1.0
click 8.1.3
colorama 0.4.6
filelock 3.12.2
Flask 2.3.2
frozenlist 1.3.3
fsspec 2023.6.0
huggingface-hub 0.16.2
idna 3.4
itsdangerous 2.1.2
Jinja2 3.1.2
MarkupSafe 2.1.3
mpmath 1.3.0
multidict 6.0.4
networkx 3.1
numpy 1.25.0
openai 0.27.8
packaging 23.1
Pillow 10.0.0
pip 23.2
pypdfium2 4.18.0
python-dotenv 1.0.0
PyYAML 6.0
regex 2023.6.3
requests 2.31.0
safetensors 0.3.1
semantra 0.1.7
setuptools 68.0.0
sympy 1.12
tiktoken 0.4.0
tokenizers 0.13.3
torch 2.0.1
torchaudio 2.0.2+cu117
torchvision 0.15.2+cu117
tqdm 4.65.0
transformers 4.30.2
typing_extensions 4.7.1
urllib3 2.0.3
Werkzeug 2.3.6
wheel 0.40.0
yarl 1.9.2
On another computer it works:
λ semantra --version
0.1.7
λ pipx runpip semantra list
Package Version
------------------ --------
aiohttp 3.8.4
aiosignal 1.3.1
annoy-fixed 1.16.3
async-timeout 4.0.2
attrs 23.1.0
blinker 1.6.2
certifi 2023.5.7
charset-normalizer 3.1.0
click 8.1.3
colorama 0.4.6
filelock 3.12.2
Flask 2.3.2
frozenlist 1.3.3
fsspec 2023.6.0
huggingface-hub 0.16.2
idna 3.4
importlib-metadata 6.7.0
itsdangerous 2.1.2
Jinja2 3.1.2
MarkupSafe 2.1.3
mpmath 1.3.0
multidict 6.0.4
networkx 3.1
numpy 1.25.0
openai 0.27.8
packaging 23.1
Pillow 10.0.0
pip 23.2
pypdfium2 4.18.0
python-dotenv 1.0.0
PyYAML 6.0
regex 2023.6.3
requests 2.31.0
safetensors 0.3.1
semantra 0.1.7
setuptools 68.0.0
sympy 1.12
tiktoken 0.4.0
tokenizers 0.13.3
torch 2.0.1
tqdm 4.65.0
transformers 4.30.2
typing_extensions 4.7.1
urllib3 2.0.3
Werkzeug 2.3.6
wheel 0.40.0
yarl 1.9.2
zipp 3.15.0
Working computer has importlib-metadata 6.7.0 and zipp 3.15.0 and does not have torchaudio 2.0.2+cu117 or torchvision 0.15.2+cu117
(from https://github.com/freedmand/semantra/issues/36#issuecomment-1624544172)
Otherwise they are the same.
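For anyone who wants to repeat this kind of comparison without reading two listings by eye, here is a minimal sketch that diffs two `pip list` outputs. The helper names `parse_pip_list` and `diff_environments` are made up for illustration and are not part of semantra or pip, and the two sample listings are only short excerpts of the ones posted above.

```python
def parse_pip_list(text: str) -> dict[str, str]:
    """Parse plain `pip list` output into a {package: version} mapping."""
    packages = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines, the "Package Version" header, and the dashed rule.
        if len(parts) != 2 or parts[0] == "Package" or set(parts[0]) == {"-"}:
            continue
        packages[parts[0]] = parts[1]
    return packages


def diff_environments(a: dict[str, str], b: dict[str, str]):
    """Return (only in a, only in b, present in both with different versions)."""
    only_a = {name: ver for name, ver in a.items() if name not in b}
    only_b = {name: ver for name, ver in b.items() if name not in a}
    changed = {
        name: (a[name], b[name])
        for name in a.keys() & b.keys()
        if a[name] != b[name]
    }
    return only_a, only_b, changed


# Excerpts of the two listings from this thread (not the full output).
failing = parse_pip_list("""
Package Version
------------------ ------------
openai 0.27.8
torch 2.0.1
torchaudio 2.0.2+cu117
torchvision 0.15.2+cu117
""")

working = parse_pip_list("""
Package Version
------------------ --------
importlib-metadata 6.7.0
openai 0.27.8
torch 2.0.1
zipp 3.15.0
""")

only_failing, only_working, changed = diff_environments(failing, working)
print("only on failing machine:", only_failing)
print("only on working machine:", only_working)
print("version mismatches:", changed)
```

In practice the two strings would be replaced with the full output of pipx runpip semantra list from each machine. This only compares package names and versions, so it would not reveal differences in the Python interpreter, the OS, or the input files themselves.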