feat: enable `continue` button for when LLM responses exceed token limit
**Describe the bug**
LLM responses seem to terminate prematurely. Do we have a max reply token setting somewhere? EDIT: Mistral is likely limited to 512 tokens?
**To Reproduce**
Steps to reproduce the behavior:
- Use Mistral 7b instruct
- Ask Mistral 7b to generate 100 names similar to "Thinking Machines"
- Mistral will generate 63 or so
- Ask Mistral to continue
- Mistral generates the remaining names, completing the list of 100
**Expected behavior**
Mistral should generate all 100 names in a single response.
**Desktop (please complete the following information):**
- OS: macOS (Mac M2, 64 GB RAM)
- Version: v0.2.0
@dan-jan this might be a Nitro token limitation that we hardcoded at some point. @tikikun?
This is not a bug, since I don't have any hardcoded token limit.
This should be transferred to jan.
@vuonghoainam I saw your latest update already defines the context length as 2048 (`{ctx_len: 2048}`), which seems reasonable, but the issue still persists. Let's investigate this sometime this week.
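For context, here is a minimal sketch of how a context length like that gets passed when loading a model, assuming Nitro's llamacpp `loadmodel` endpoint; the model path and `ngl` value are illustrative assumptions:

```ts
// Illustrative sketch: the endpoint follows Nitro's llamacpp API, but the
// model path and ngl value here are assumptions.
async function loadModel(): Promise<void> {
  const res = await fetch("http://localhost:3928/inferences/llamacpp/loadmodel", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      llama_model_path: "/models/mistral-7b-instruct.gguf", // assumed path
      ctx_len: 2048, // the context length discussed above
      ngl: 100,      // layers to offload to GPU (assumed)
    }),
  });
  if (!res.ok) throw new Error(`loadmodel failed: ${res.status}`);
}
```

Note that `ctx_len` bounds the whole context window (prompt plus reply), which is distinct from a per-reply max token setting.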
@imtuyethan can you check whether this issue still exists? I'll close it if not.
Just checked, still happens on my end.
cc @vuonghoainam @louis-jan @0xSage
https://github.com/janhq/jan/assets/89722390/a398fc78-461d-48e8-a75a-3e3d7abfe7fb
I will check this
The way this should work is that the user can click a button to keep generating. The default behavior of an LLM is to stop once the token limit is reached; see ChatGPT's "Continue generating" button for an example.
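A minimal sketch of that flow, assuming an OpenAI-compatible chat-completions endpoint (the URL, `max_tokens` value, and function names below are illustrative assumptions, not Jan's actual code): clicking the button re-sends the thread so the model picks up where its truncated reply left off.

```ts
// Illustrative sketch only: endpoint URL, max_tokens, and helper names
// are assumptions, not Jan's actual implementation.
type Msg = { role: "system" | "user" | "assistant"; content: string };

async function complete(messages: Msg[]): Promise<{ message: Msg; finish_reason: string }> {
  const res = await fetch("http://localhost:3928/v1/chat/completions", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ messages, max_tokens: 512 }),
  });
  const data = await res.json();
  return data.choices[0];
}

// Handler for the proposed "Continue" button: re-send the thread, including
// the truncated assistant message, then splice the new text onto it so the
// thread reads as a single reply.
async function onContinueClicked(thread: Msg[]): Promise<void> {
  const choice = await complete(thread);
  thread[thread.length - 1].content += choice.message.content;
}
```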
Same result with other testing tools. Good suggestion @vuonghoainam. Do you know what indicator we can use for displaying this button? Specifically, how can we determine from the response that the message hit the token limit (do we need to count the tokens)?
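One possible indicator, assuming the response follows the OpenAI schema: the server already reports why generation stopped in `finish_reason`, so the client doesn't need to count tokens. A value of `"length"` means the reply was cut off by the token limit, while `"stop"` means the model finished naturally. The trimmed types below are assumptions for illustration:

```ts
// Trimmed, assumed response types following the OpenAI schema.
interface Choice {
  message: { role: string; content: string };
  finish_reason: "stop" | "length" | null;
}
interface ChatCompletion {
  choices: Choice[];
}

// Show the "Continue" button only when the reply was truncated by the
// token limit.
function shouldShowContinue(response: ChatCompletion): boolean {
  return response.choices[0]?.finish_reason === "length";
}
```

For streamed responses the same field arrives on the final chunk, so the check can run when the stream ends.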
Converted this to a feature request
@louis-jan I'm going to shift this to "Jan as the Default Assistant", and we'll re-file this issue as a "Continue" button request.
Closed this as deprecated; we now have a max tokens setting in thread settings.