Akarshan Biswas
@ngxson ~~Can you please refresh this branch with master?~~ Nvm. Ended up using your fork .. ~~working great!!!~~ 👍 On further testing, it seems that llama_batch_size is sometimes exceeded in successive...
Just to point out that during nnx model training, overall GPU usage does not go above 20%. Will splitting the model into a graphdef and state improve performance?
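For reference, a minimal sketch of the split/merge pattern in question, assuming Flax NNX with plain `jax.jit`; the `MLP` module and the naive SGD update below are illustrative stand-ins, not the actual training code:

```python
import jax
import jax.numpy as jnp
from flax import nnx

class MLP(nnx.Module):  # hypothetical stand-in for the real model
    def __init__(self, din, dhidden, dout, *, rngs: nnx.Rngs):
        self.fc1 = nnx.Linear(din, dhidden, rngs=rngs)
        self.fc2 = nnx.Linear(dhidden, dout, rngs=rngs)

    def __call__(self, x):
        return self.fc2(nnx.relu(self.fc1(x)))

model = MLP(32, 128, 10, rngs=nnx.Rngs(0))
# Split the stateful Module into a static graphdef and a pytree of arrays,
# so the training step can be a pure function under plain jax.jit.
graphdef, state = nnx.split(model)

@jax.jit
def train_step(state, x, y):
    def loss_fn(state):
        m = nnx.merge(graphdef, state)  # rebuild the module inside the traced function
        preds = m(x)
        return jnp.mean((preds - y) ** 2)

    loss, grads = jax.value_and_grad(loss_fn)(state)
    # Naive SGD update applied directly on the state pytree (illustrative only).
    state = jax.tree.map(lambda p, g: p - 1e-3 * g, state, grads)
    return state, loss

x = jnp.ones((8, 32))
y = jnp.zeros((8, 10))
state, loss = train_step(state, x, y)
```

My understanding is that split/merge mainly removes the per-call Python graph-traversal overhead that `nnx.jit` otherwise incurs, so it should help if per-step Python overhead is the bottleneck, not if the GPU is starved by the input pipeline or small batch sizes.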
Fixed in 0.6.6 with dbdc03158300dea06c1f3ff025fe4c7ceff66969 and subsequent commits. Now the backend manages its own llama-server state.
This is an AppImage bundling problem. The only way to fix it is to either build it from source on RHEL 9.6 or wait for the Flatpak package. Unfortunately, the Flatpak package...
What happens when you append `--log-verbose` to llama-server? Unfortunately, I don't have a Windows machine to test on.
If you are able to share a stack trace, that would be very helpful; it would let us pinpoint the issue.
Also, adding to this: proper function-calling support in the server, since Llama 3.1 now supports tool/function calling.
> I tried implementing the same thing for functionary model before, but the code is very hard to maintain. ~~Can you point me to that commit?~~ Edit: @ngxson Got the...
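For context, a minimal sketch of what the requested server-side function calling could look like from the client side, assuming an OpenAI-compatible `/v1/chat/completions` endpoint on llama-server; the tool schema, model name, and port are illustrative assumptions:

```python
# Illustrative client request with an OpenAI-style "tools" payload.
import json
import requests

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool for demonstration
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = requests.post(
    "http://localhost:8080/v1/chat/completions",  # assumed local llama-server
    json={
        "model": "llama-3.1-8b-instruct",  # placeholder model name
        "messages": [{"role": "user", "content": "What's the weather in Kolkata?"}],
        "tools": tools,
        "tool_choice": "auto",
    },
    timeout=60,
)
choice = resp.json()["choices"][0]
# If the model decides to call the tool, the arguments come back as a JSON string.
for call in choice["message"].get("tool_calls", []):
    print(call["function"]["name"], json.loads(call["function"]["arguments"]))
```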
Moving to Tauri also opens up an opportunity to integrate llama.cpp Rust bindings directly into Jan for the llama.cpp provider extension. Doing so would: * **Provide enhanced hardware insights within Jan**,...
For now, I think we should support the backends that ggml supports for local inference. MLX, which can expose an OpenAI-compatible server, can already be used as an external provider.
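As a rough sketch of that external-provider route, assuming an MLX OpenAI-compatible server (e.g. one started with `mlx_lm.server`) is already running locally; the host, port, and model name below are assumptions:

```python
# Point a standard OpenAI client at a locally running MLX server.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

resp = client.chat.completions.create(
    model="mlx-community/Meta-Llama-3.1-8B-Instruct-4bit",  # placeholder model id
    messages=[{"role": "user", "content": "Hello from an external MLX provider!"}],
)
print(resp.choices[0].message.content)
```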