bug: engine settings are not being loaded from the model.json file

Open louis-jan opened this issue 2 months ago • 3 comments

Describe the bug

The nitro-extension, which runs the nitro engine within the application, currently sets the thread count equal to the number of physical cores to avoid resource hogging. However, a value set in model.json should override this default.

platform info: 
ubuntu 22 lts
i3 10100f (4c8t)
rx550 2gb
32gb ram
jan 0.4.1 .deb dpkg install

issue:
unable to configure cpu threads allocated to nitro

details:
there is no nitro.json file in jan/engines, although the documentation suggests there should be one. making one manually and specifying cpu_threads in it has no effect (ngl and cpu_threads didn't apply, per logs and activity monitor)
some relevant log excerpts:

2024-04-11T09:42:36.329Z [NITRO]::Debug: Loading model with params {"ctx_len":4096,"prompt_template":"<start_of_turn>user\n{prompt}<end_of_turn>\n<start_of_turn>model","llama_model_path":"/alx_thor_pool_1/AI/jan/models/gemma-7b/gemma-7b-it-q4_K_M.gguf","user_prompt":"<start_of_turn>user\n","ai_prompt":"<end_of_turn>\n<start_of_turn>model","cpu_threads":4,"ngl":100}

{"timestamp":1712828556,"level":"INFO","function":"loadModelImpl","line":637,"message":"system info","n_threads":4,"total_threads":8,"system_info":"AVX = 1 | AVX_VNNI = 0 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | MATMUL_INT8 = 0 | "}

Running nitro manually and specifying cpu_threads=8 when loading the model makes nitro use all 8 as intended:

{"timestamp":1712829316,"level":"INFO","function":"loadModelImpl","line":637,"message":"system info","n_threads":8,"total_threads":8,"system_info":"AVX = 1 | AVX_VNNI = 0 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | MATMUL_INT8 = 0 | "}
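To compare the two runs without reading the raw lines by eye, the thread counts can be pulled out of each "system info" entry. This is a minimal sketch that assumes the log lines are valid JSON with the `n_threads` and `total_threads` fields shown above; `parseThreadInfo` and `ThreadInfo` are hypothetical names, not part of Jan or nitro.

```typescript
// Hypothetical helper: extracts thread usage from one JSON-formatted log line.
interface ThreadInfo {
  nThreads: number;     // threads the engine actually uses
  totalThreads: number; // threads the engine reports as available
}

function parseThreadInfo(logLine: string): ThreadInfo | null {
  let entry: unknown;
  try {
    entry = JSON.parse(logLine);
  } catch (_err) {
    return null; // not a JSON log line
  }
  if (typeof entry !== "object" || entry === null) {
    return null;
  }
  const record = entry as Record<string, unknown>;
  const n = record["n_threads"];
  const total = record["total_threads"];
  if (typeof n !== "number" || typeof total !== "number") {
    return null; // a JSON line, but not a "system info" entry
  }
  return { nThreads: n, totalThreads: total };
}

// Sample input: the first excerpt above, with the long system_info field left out.
const sampleLogLine =
  '{"timestamp":1712828556,"level":"INFO","function":"loadModelImpl",' +
  '"line":637,"message":"system info","n_threads":4,"total_threads":8}';
const sampleThreadInfo = parseThreadInfo(sampleLogLine);
```

If `nThreads` stays at 4 after cpu_threads was set to 8, the configured value never reached the engine.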

https://discord.com/channels/1107178041848909847/1227922266684395520/1227922266684395520

Expected behavior

The nitro extension should respect and utilize the override value provided by the model.json file.
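The expected precedence can be sketched as a simple merge in which any value from model.json wins over the computed default. This is an illustration only: `EngineSettings` and `resolveEngineSettings` are hypothetical names, the field names are taken from the log output in this issue, and the extension's real code may be structured differently.

```typescript
// Hypothetical settings shape; field names mirror the "Loading model with params" log line.
interface EngineSettings {
  ctx_len?: number;
  ngl?: number;
  cpu_threads?: number;
}

// Start from the computed defaults, then apply every value that model.json defines.
function resolveEngineSettings(
  defaults: EngineSettings,
  fromModelJson: EngineSettings
): EngineSettings {
  const merged: EngineSettings = { ...defaults };
  for (const key of Object.keys(fromModelJson) as (keyof EngineSettings)[]) {
    const value = fromModelJson[key];
    if (value !== undefined) {
      merged[key] = value; // explicit user value overrides the default
    }
  }
  return merged;
}

// Default of 4 threads (physical cores on the reporter's CPU), overridden to 8.
const resolvedSettings = resolveEngineSettings(
  { cpu_threads: 4, ngl: 100 },
  { cpu_threads: 8 }
);
```

Here `resolvedSettings.cpu_threads` is 8 while `ngl` keeps its default of 100. Checking against `undefined` rather than truthiness matters, because a user may legitimately set a value such as `ngl` to 0.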

louis-jan avatar Apr 11 '24 10:04 louis-jan

The issue also appears to be present on Mac. I tried loading Command R Plus and tweaking the penalty value, but the logs suggest it always loads with "repeat_penalty":1.0 despite the model.json being edited with values ranging from 0 to 0.1. The value is reflected correctly in Jan's GUI, though.

vlbosch avatar Apr 11 '24 12:04 vlbosch

Correction: it appears that "repeat_penalty" always defaults to 1.0, while "frequency_penalty" is applied separately and is reflected correctly. How can I change the default "repeat_penalty" value or disable the penalty altogether?

Besides that, I noticed that the temperature the model is loaded with differs from the value in the model.json. It loads with 0.800000011920929, even after changing the value in the JSON file. I also noticed that the "Inference Parameters" in the GUI are greyed out and can no longer be changed. [Screenshot 2024-04-11 at 14.36.18]

vlbosch avatar Apr 11 '24 12:04 vlbosch

@vlbosch I see the same at model initialisation, but every prompt passes the temperature in the call it makes. I tested this by setting the temperature to 0, which is fully deterministic: every time you delete the answer and rerun the question, the answer is identical.
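The behaviour described here, where the load-time temperature is only a fallback and each prompt carries its own value, can be sketched as follows. The names `SamplingParams`, `ChatRequest` and `buildChatRequest` are hypothetical, and the request shape is an assumption for illustration, not Jan's actual payload.

```typescript
// Hypothetical shapes for illustration only.
interface SamplingParams {
  temperature?: number;
  frequency_penalty?: number;
}

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatRequest extends SamplingParams {
  messages: ChatMessage[];
}

// Build the body for one prompt: per-request values replace load-time defaults.
function buildChatRequest(
  messages: ChatMessage[],
  loadTimeDefaults: SamplingParams,
  perRequest: SamplingParams
): ChatRequest {
  const body: ChatRequest = { messages, ...loadTimeDefaults };
  for (const key of Object.keys(perRequest) as (keyof SamplingParams)[]) {
    const value = perRequest[key];
    if (value !== undefined) {
      body[key] = value; // 0 is a valid value, so only undefined is skipped
    }
  }
  return body;
}

// The model loaded with 0.8, but this prompt is sent with temperature 0.
const deterministicRequest = buildChatRequest(
  [{ role: "user", content: "Hello" }],
  { temperature: 0.8 },
  { temperature: 0 }
);
```

Under this assumption the value logged at model load does not determine sampling, which would be consistent with the identical answers seen at temperature 0.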

Propheticus avatar Apr 11 '24 14:04 Propheticus

Hi @Propheticus, would you mind double-checking after the fix from the dev team, using Jan v0.4.11-394? Many thanks 🙏

Van-QA avatar Apr 24 '24 16:04 Van-QA

@Van-QA It works and respects the cpu_threads value now. It defaulted to 9 (I have 8 physical cores), and I was able to override that via the model.json.

v0.4.11 and all nightlies are still broken for me, though. Reply 3 or 4 looks like: [image] Rolling back to v0.4.10 again 😢

Propheticus avatar Apr 24 '24 19:04 Propheticus