ManniX-ITA
I have updated it to fix IQ4_NL
> This is great. Any specific reason IQ3_M isn't included? I think there's still something wrong, you can quantize it but not run it with the main release. If you...
@sammcj Do you have to approve it again?
Sent a PR to fix the startup issues; if you can, please test it: https://github.com/ollama/ollama/pull/3702
@BruceMacD Any idea when we can get this merged?
Will quantize it for sure, but it's only 4k context for now. They will release a new version later; if you see it on HF and I miss it, ping...
@BruceMacD I'm a bit overloaded lately; if you can do the merge, I'd really appreciate it! Thanks
@dhiltgen perfect, thanks!
> In any case, it would be more logical to call `ctx_server.load_model(params)` only after all endpoints are registered. Additionally, we can add a middleware to throw 503 if the model...
> Furthermore, this change requires the main thread to call `svr` to register new endpoints after it is spawned into a new thread. This will make `svr` not thread-safe. You are right...
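
For anyone following along, here is a rough Python sketch of the ordering suggested in the quoted comments: register every endpoint first, load the model afterwards, and have a middleware-style gate answer 503 in the meantime. This is not the actual server code. `ModelServer`, `route`, `handle`, and the response strings are hypothetical names made up for illustration, and since the quote is cut off, I am assuming the 503 condition is "the model has not finished loading".

```python
import threading


class ModelServer:
    """Hypothetical stand-in for the HTTP server discussed above."""

    def __init__(self):
        # Set once the model is loaded; threading.Event is safe to
        # read and set from different threads.
        self._ready = threading.Event()
        self._routes = {}

    def route(self, path, handler):
        # Register an endpoint. In this sketch all routes are added
        # before any worker thread starts serving requests.
        self._routes[path] = handler

    def load_model(self):
        # Placeholder for the slow model load (assumption: the real
        # load happens here and can take a long time).
        self._ready.set()

    def handle(self, path):
        # Middleware-style gate: refuse requests until the model is ready.
        if not self._ready.is_set():
            return 503, "model is loading"
        handler = self._routes.get(path)
        if handler is None:
            return 404, "not found"
        return 200, handler()


server = ModelServer()
server.route("/health", lambda: "ok")  # 1. register all endpoints
print(server.handle("/health"))        # model not loaded yet, gate answers 503
server.load_model()                    # 2. only then load the model
print(server.handle("/health"))        # endpoint is now served normally
```

Because no route is added after serving begins, the route table is never mutated concurrently, which is the thread-safety concern raised in the second quote.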