ManniX-ITA

87 comments by ManniX-ITA

> This is great. Any specific reason IQ3_M isn't included? I think there's still something wrong: you can quantize it but not run it with the main release. If you...

Sent a PR to fix the startup issues; if you can, please test it: https://github.com/ollama/ollama/pull/3702

@BruceMacD Any idea when we can get this merged?

I will quantize it for sure, but it's only 4k context for now. They will release a new version later; if you see it on HF and I miss it, ping...

@BruceMacD I'm a bit overloaded lately; if you can do the merge, I'd really appreciate it! Thanks.

> In any case, it will be more logical to call `ctx_server.load_model(params)` only after all endpoints are registered. Additionally, we can add a middleware to throw 503 if the model...

> Furthermore, this change requires the main thread to call `svr` to register new endpoints after it is spawned into a new thread. This will make `svr` not thread-safe. You are right...