
bug: doc retrieval and embedding broken

Open · Propheticus opened this issue 10 months ago · 3 comments

Describe the bug
Since installing v0.4.9-343, I can no longer attach documents in the chat unless it's the first message in a thread. Attaching a document in the first message also fails when assistant instructions are set.

Steps to reproduce the behaviour:

  1. Open a new thread.
  2. Say "Hi".
  3. Attach a doc and ask for a summary.
  4. The response fails with "Apologies something's amiss!"
  Or:
  5. Open a new thread.
  6. Fill in the assistant instructions field, e.g. "you're allowed to be slightly sarcastic".
  7. Attach a doc and ask for a summary in the first message.
  8. The response fails with "Apologies something's amiss!"

Expected behaviour
I expect the doc to be embedded and a response to be generated using it as context.
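
(For context on what "embedded and used as context" involves: retrieval typically chunks the attached document, embeds each chunk, and prepends the chunks most similar to the question to the prompt. Below is a generic sketch of that flow, not Jan's actual implementation; embed() here is a hypothetical stand-in for whatever embedding model is configured.)

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    # Hypothetical stand-in for the configured embedding model; a real
    # implementation would call whatever model Jan has loaded.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal(384)

def build_prompt(document: str, question: str,
                 chunk_size: int = 512, top_k: int = 3) -> str:
    # 1. Split the document into fixed-size chunks.
    chunks = [document[i:i + chunk_size]
              for i in range(0, len(document), chunk_size)]
    # 2. Embed every chunk and the question.
    chunk_vecs = np.stack([embed(c) for c in chunks])
    q_vec = embed(question)
    # 3. Rank chunks by cosine similarity to the question.
    sims = chunk_vecs @ q_vec / (
        np.linalg.norm(chunk_vecs, axis=1) * np.linalg.norm(q_vec))
    best = [chunks[i] for i in np.argsort(sims)[::-1][:top_k]]
    # 4. Prepend the retrieved chunks to the prompt as context.
    return "Context:\n" + "\n---\n".join(best) + f"\n\nQuestion: {question}"
```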

Screenshots
[screenshot attached in the original issue]

Environment details

  • Operating System: Win 11 Pro N x64
  • Jan Version: 0.4.9-343
  • Processor: Ryzen 7 7700X
  • RAM: 32GB
  • GPU: AMD RX 6800XT 16GB

Logs

2024-03-26T12:24:05.206Z [NITRO]::Debug: 20240326 12:17:14.343000 UTC 16136 INFO  Here is the result:0 - llamaCPP.cc:420
20240326 12:24:05.205000 UTC 4816 INFO  Clean cache threshold reached! - llamaCPP.cc:192
20240326 12:24:05.205000 UTC 4816 INFO  Cache cleaned - llamaCPP.cc:194
20240326 12:24:05.205000 UTC 4816 ERROR Unhandled exception in /inferences/llamacpp/chat_completion, what(): Type is not convertible to string - HttpAppFrameworkImpl.cc:124

2024-03-26T12:24:06.212Z [NITRO]::Debug: 20240326 12:24:06.210000 UTC 4816 INFO  sent the non stream, waiting for respone - llamaCPP.cc:416
[1711455846] [D:\a\nitro\nitro\controllers\llamaCPP.h:  882][llama_server_context::launch_slot_with_data] slot 0 is processing [task id: 5]

2024-03-26T12:24:06.212Z [NITRO]::Debug: [1711455846] [D:\a\nitro\nitro\controllers\llamaCPP.h: 1722][llama_server_context::update_slots] slot 0 : kv cache rm - [0, end)

2024-03-26T12:24:06.617Z [NITRO]::Debug: [1711455846] [D:\a\nitro\nitro\controllers\llamaCPP.h:  475][llama_client_slot::print_timings] 
[1711455846] [D:\a\nitro\nitro\controllers\llamaCPP.h:  480][llama_client_slot::print_timings] print_timings: prompt eval time =     332.33 ms /    65 tokens (    5.11 ms per token,   195.59 tokens per second)
[1711455846] [D:\a\nitro\nitro\controllers\llamaCPP.h:  485][llama_client_slot::print_timings] print_timings:        eval time =      72.45 ms /     6 runs   (   12.07 ms per token,    82.82 tokens per second)
[1711455846] [D:\a\nitro\nitro\controllers\llamaCPP.h:  487][llama_client_slot::print_timings] print_timings:       total time =     404.77 ms
[1711455846] [D:\a\nitro\nitro\controllers\llamaCPP.h: 1585][llama_server_context::update_slots] slot 0 released (72 tokens in cache)
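
(Side note: the unhandled-exception text above matches the message jsoncpp produces when Json::Value::asString() is called on a value that is not a string. One plausible trigger, offered here as an assumption rather than something the logs confirm, is a message content field arriving as an array of parts instead of a flat string. A Python analogue of that server-side check:)

```python
# Two shapes a client can send for the same user turn; the second,
# "parts" form is common for messages that carry attachments.
flat  = {"role": "user", "content": "Summarise the attached document."}
parts = {"role": "user", "content": [{"type": "text",
                                      "text": "Summarise the attached document."}]}

def content_as_string(msg: dict) -> str:
    # Mirrors a server that assumes content is always a flat string;
    # jsoncpp's Value::asString() fails the same way in C++.
    if not isinstance(msg["content"], str):
        raise TypeError("Type is not convertible to string")
    return msg["content"]

print(content_as_string(flat))         # fine
try:
    content_as_string(parts)           # the array-of-parts shape
except TypeError as e:
    print("server-side analogue:", e)  # Type is not convertible to string
```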

Additional context
Changing the context length back to the default 4096 (from 32k or 16k) does not fix it. Tested using Mistral 7B Instruct v0.2 Q5_K_M.

Adding context mid-chat using the API endpoint (from Anything LLM) works.
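
(For reference, a minimal sketch of that kind of mid-chat request: the document text is inlined directly into the messages, so it bypasses the retrieval/embedding pipeline entirely. The endpoint path comes from the logs above; the port, 3928 being Nitro's usual default, the filename, and the exact payload fields are assumptions rather than details from this report.)

```python
import requests

# Endpoint path taken from the error log above; port 3928 is Nitro's
# usual default and may differ on your install.
URL = "http://localhost:3928/inferences/llamacpp/chat_completion"

# Any local text file stands in for the "attachment" here.
with open("report.txt", encoding="utf-8") as f:
    doc_text = f.read()

payload = {
    "stream": False,
    "messages": [
        {"role": "user", "content": "Hi"},
        {"role": "assistant", "content": "Hello! How can I help?"},
        # Mid-thread attachment: the document is pasted in as plain text.
        {"role": "user", "content": "Summarise this document:\n\n" + doc_text},
    ],
}

resp = requests.post(URL, json=payload, timeout=120)
resp.raise_for_status()
print(resp.json())
```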

Propheticus · Mar 26 '24

Thank you, the issue is reproducible on both Windows and macOS. We will resolve it ASAP. @louis-jan

Van-QA · Mar 26 '24

Hi @Propheticus, can you try again using our latest nightly? 🙏 Jan v0.4.9-345

Van-QA · Mar 27 '24

Thanks @Van-QA. Appears to have been fixed 👍. Still some quirks, but I'll report those separately.

Propheticus · Mar 27 '24