feat: enable `continue` button for when LLM responses exceed token limit
**Describe the bug**
LLM responses seem to terminate prematurely. Do we have a max reply token setting somewhere? EDIT: Mistral is likely limited to 512 tokens?
**To Reproduce**
Steps to reproduce the behavior:
- Use Mistral 7b instruct
- Ask Mistral 7b to generate 100 names similar to "Thinking Machines"
- Mistral will generate 63 or so
- Ask Mistral to continue
- Mistral generates the remaining names, completing the list of 100
**Expected behavior**
Mistral should generate all 100 names in a single response.
**Desktop (please complete the following information):**
- OS: macOS (Mac M2, 64 GB RAM)
- Version: v0.2.0
@dan-jan this might be a Nitro token limitation that we hardcoded at some point. @tikikun?
This is not a bug, since I don't have any hardcoded token limit.
This should be transferred to jan.
@vuonghoainam I saw your latest update already defines the context length as 2048 (`{ctx_len: 2048}`), which seems reasonable, but the issue still persists. Let's investigate this sometime this week.
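For context, here is a minimal sketch of how a context length like that gets passed when loading a model, assuming Nitro's llamacpp `loadmodel` endpoint; the model path and `ngl` value are illustrative assumptions:

```ts
// Illustrative sketch: the endpoint follows Nitro's llamacpp API, but the
// model path and ngl value here are assumptions.
async function loadModel(): Promise<void> {
  const res = await fetch("http://localhost:3928/inferences/llamacpp/loadmodel", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      llama_model_path: "/models/mistral-7b-instruct.gguf", // assumed path
      ctx_len: 2048, // the context length discussed above
      ngl: 100,      // layers to offload to GPU (assumed)
    }),
  });
  if (!res.ok) throw new Error(`loadmodel failed: ${res.status}`);
}
```

Note that `ctx_len` bounds the whole context window (prompt plus reply), which is distinct from a per-reply max token setting.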
@imtuyethan can you check whether this issue still exists? I'll close it if not.
Just checked, still happens on my end.
cc @vuonghoainam @louis-jan @0xSage
https://github.com/janhq/jan/assets/89722390/a398fc78-461d-48e8-a75a-3e3d7abfe7fb
I will check this
The way this should work is that the user can click a button to keep generating. The default behavior of an LLM is to stop once the token limit is reached; see ChatGPT's "Continue generating" button for an example.
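A minimal sketch of that flow, assuming an OpenAI-compatible chat-completions endpoint (the URL, `max_tokens` value, and function names below are illustrative assumptions, not Jan's actual code): clicking the button re-sends the thread so the model picks up where its truncated reply left off.

```ts
// Illustrative sketch only: endpoint URL, max_tokens, and helper names
// are assumptions, not Jan's actual implementation.
type Msg = { role: "system" | "user" | "assistant"; content: string };

async function complete(messages: Msg[]): Promise<{ message: Msg; finish_reason: string }> {
  const res = await fetch("http://localhost:3928/v1/chat/completions", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ messages, max_tokens: 512 }),
  });
  const data = await res.json();
  return data.choices[0];
}

// Handler for the proposed "Continue" button: re-send the thread, including
// the truncated assistant message, then splice the new text onto it so the
// thread reads as a single reply.
async function onContinueClicked(thread: Msg[]): Promise<void> {
  const choice = await complete(thread);
  thread[thread.length - 1].content += choice.message.content;
}
```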
Same result with other testing tools. Good suggestion @vuonghoainam. Do you know what indicator we can use for displaying this button? Specifically, how can we determine from the response that the message hit the token limit (do we need to count the tokens)?
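One possible indicator, assuming the response follows the OpenAI schema: the server already reports why generation stopped in `finish_reason`, so the client doesn't need to count tokens. A value of `"length"` means the reply was cut off by the token limit, while `"stop"` means the model finished naturally. The trimmed types below are assumptions for illustration:

```ts
// Trimmed, assumed response types following the OpenAI schema.
interface Choice {
  message: { role: string; content: string };
  finish_reason: "stop" | "length" | null;
}
interface ChatCompletion {
  choices: Choice[];
}

// Show the "Continue" button only when the reply was truncated by the
// token limit.
function shouldShowContinue(response: ChatCompletion): boolean {
  return response.choices[0]?.finish_reason === "length";
}
```

For streamed responses the same field arrives on the final chunk, so the check can run when the stream ends.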
Converted this to a feature request
@louis-jan I'm going to shift this to "Jan as the Default Assistant", and we'll re-file this issue as a "Continue" button request.
Closed this as deprecated; we now have a max tokens setting in thread settings.