vim-ai
Allow dynamic setting of max_tokens based on input text
During completion or editing, an error reliably occurs when max_tokens + [length of selected text] + [length of instruction given] exceeds the model's context length limit, even if the model would have generated a very short completion.
For example, setting max_tokens to 4000 in my config with text-davinci-003 will error on :AIEdit whenever my selected text is realistically long (say, 100 tokens). To avoid this, I have to write a constraining configuration that assumes the worst case (it has been reasonable to assume I will never insert a long text and request back a longer one, so half the model's capacity does OK).
However, it would be nice to be able to set something like "dynamic_max_tokens": { "enabled": 1, "effective_max": 4096 } and have the plugin do some math to derive an appropriate max_tokens from whatever is left over after accounting for the pasted text. This behavior would also make room for smarter error handling in cases where the selected text is close to or over the maximum length.
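Rough sketch of what I have in mind (the option name dynamic_max_tokens and these helper functions are hypothetical, not existing plugin options, and the token count is only a word-based approximation rather than real tokenization):

def estimate_tokens(text):
    # Very rough heuristic: roughly 1 token per 3/4 of a word.
    return int(len(text.split()) / 0.75)

def resolve_max_tokens(options, prompt):
    # If dynamic_max_tokens is enabled, size max_tokens to whatever room
    # is left in the context window after the prompt; otherwise fall back
    # to the configured value.
    dyn = options.get("dynamic_max_tokens", {})
    if not dyn.get("enabled"):
        return options["max_tokens"]
    effective_max = dyn["effective_max"]
    remaining = effective_max - estimate_tokens(prompt)
    if remaining <= 0:
        raise ValueError("Selected text is already at or over the model's context limit")
    return min(options.get("max_tokens", remaining), remaining)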
I would be happy to work on a PR for this!
At a minimum, I hope this issue will help others understand why they are getting this somewhat surprising error when running with an ambitiously chosen max_tokens:
Error detected while processing function vim_ai#AIEditRun[13]..provider#python3#Call:
line 18:
Error invoking 'python_execute_file' on channel 8 (python3-script-host):
Traceback (most recent call last):
  File "/Users/eve/.vim/plugged/vim-ai/py/complete.py", line 55, in <module>
    handle_completion_error(error)
  File "/Users/eve/.vim/plugged/vim-ai/py/utils.py", line 163, in handle_completion_error
    raise error
  File "/Users/eve/.vim/plugged/vim-ai/py/complete.py", line 52, in <module>
    render_text_chunks(text_chunks)
  File "/Users/eve/.vim/plugged/vim-ai/py/utils.py", line 53, in render_text_chunks
    print_info_message('Empty response received. Tip: You can try modifying the prompt and retry.')
  File "/Users/eve/.vim/plugged/vim-ai/py/utils.py", line 140, in print_info_message
    vim.command(f"normal \<Esc>")
  File "/usr/local/lib/python3.11/site-packages/pynvim/api/nvim.py", line 287, in command
    return self.request('nvim_command', string, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/pynvim/api/nvim.py", line 182, in request
    res = self._session.request(name, *args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/pynvim/msgpack_rpc/session.py", line 102, in request
    raise self.error_wrapper(err)
pynvim.api.common.NvimError: Vim(normal):E21: Cannot make changes, 'modifiable' is off
Thanks for the detailed explanation! I think "effective max" is a good idea. It is, however, quite challenging to count tokens in bare Python without using any external deps/libs.
Maybe as a start it would help to make max_tokens optional, as suggested in issue #42.
Maybe for counting tokens it would be just fine to use a simple approximation like 1 token ≈ ¾ of a word, as described here: https://help.openai.com/en/articles/4936856-what-are-tokens-and-how-to-count-them
Also, it might make sense to turn on "effective max" only when max_tokens + prompt exceeds the model's token limit.
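Something along these lines, purely as an illustrative sketch (model_token_limit would have to come from per-model configuration; the word-split estimate is the approximation from the article above):

def adjusted_max_tokens(max_tokens, prompt, model_token_limit=4097):
    # Approximate prompt size: roughly 1 token per 3/4 of a word.
    prompt_tokens = int(len(prompt.split()) / 0.75)
    # Leave max_tokens alone unless the request would overflow the
    # model's context window; only then shrink it to fit.
    if max_tokens + prompt_tokens <= model_token_limit:
        return max_tokens
    return max(model_token_limit - prompt_tokens, 1)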
For some reason I believed this issue was not the same as #42, but I realize now that person is probably hitting the same problem, given the numbers they are quoting.
I wonder what the behavior is when max_tokens is not specified -- does it assume the maximum allowed given the prompt length, or the prompt length plus something specific? That would be convenient, but it is undocumented.
In any case I will take a crack at making this optional sometime this month.