ManniX-ITA

Results 87 comments of ManniX-ITA

They will be supported in the future, not sure when. There's not huge interest because i-matrix quants are noticeably slower during inference. And it takes a lot of...

To be honest, anything below Q4 is poor quality; it's better to pick a smaller model. There are other formats better suited to 2/3-bit than GGUF, with 3-bit very...

> * As far as I know, IQ quants are not the same thing as i-matrix quants, which can apply to any of the other quants, like K quants. I...

> I'm done arguing with you, "for obvious reasons." I'm done arguing too; there's really no obvious reason why you should attack me or defend @sammcj... Weird! But thanks for...

Made a PR to support the latest IQ formats: https://github.com/ollama/ollama/pull/3657 **IQ4_NL is now fixed.** They work pretty well for me, but only on the GPU. Definitely not recommended for running on...

The enum order doesn't matter; the type is checked against the tensor's `t.Kind`. And it didn't mess up my massive library, so don't worry :P ```go func (t Tensor)...
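The kind-based dispatch described in that comment can be sketched as follows. This is a minimal illustration, not Ollama's actual code: the constant names, numeric kind values, and the `TypeName` method are hypothetical stand-ins. The point is that resolution compares the tensor's numeric `Kind` value directly, so the declaration order of the constants is irrelevant.

```go
package main

import "fmt"

// Hypothetical tensor-kind IDs in the style of GGML's numeric type
// codes; the exact values here are illustrative only.
const (
	kindF32   uint32 = 0
	kindQ4K   uint32 = 12
	kindIQ4NL uint32 = 20
)

// Tensor is a minimal stand-in for a parsed GGUF tensor entry.
type Tensor struct {
	Name string
	Kind uint32
}

// TypeName resolves the quant type from the tensor's Kind value.
// Because the lookup matches the numeric Kind directly, the order in
// which the constants above are declared does not matter.
func (t Tensor) TypeName() string {
	switch t.Kind {
	case kindF32:
		return "F32"
	case kindQ4K:
		return "Q4_K"
	case kindIQ4NL:
		return "IQ4_NL"
	default:
		return "unknown"
	}
}

func main() {
	t := Tensor{Name: "blk.0.attn_q.weight", Kind: kindIQ4NL}
	fmt.Println(t.TypeName()) // prints "IQ4_NL"
}
```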

> So it's definitely not stored anywhere in Ollama's metadata files (that was my main worry)? Definitely not, the file is parsed every time it's loaded.

> Like I said, I defended _his point_. Thanks for the PR. Are you giving up on IQ4_NL? Should someone else look into it? Let it go, I don't mind...

I have updated the PR to fix IQ4_NL support; I will add the benchmark to the table above.