SolsticeProjekt

5 comments by SolsticeProjekt

While I was trying to figure out how to convert a small PyTorch-based model to ggml, I found this thread. I wanted to emphasize that *small* models (sub-1 GB)...
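Conversion scripts of this kind generally walk the model's state dict and serialize each named tensor into one flat binary file. Below is a minimal sketch of that loop with a *hypothetical* record layout (name length, dimension count, dims, name bytes, fp32 data); the real ggml format has its own magic number, hyperparameter header, and quantization types, so treat this only as an illustration of the pattern:

```python
import struct
from io import BytesIO

def write_tensor(buf, name, shape, values):
    # Hypothetical record layout, NOT the real ggml format:
    # <name_len:i32> <n_dims:i32> <dims:i32...> <name bytes> <fp32 data>
    nb = name.encode("utf-8")
    buf.write(struct.pack("<ii", len(nb), len(shape)))
    buf.write(struct.pack(f"<{len(shape)}i", *shape))
    buf.write(nb)
    buf.write(struct.pack(f"<{len(values)}f", *values))

def read_tensor(buf):
    # Inverse of write_tensor: parse one record back into (name, shape, values).
    name_len, n_dims = struct.unpack("<ii", buf.read(8))
    shape = struct.unpack(f"<{n_dims}i", buf.read(4 * n_dims))
    name = buf.read(name_len).decode("utf-8")
    count = 1
    for d in shape:
        count *= d
    values = list(struct.unpack(f"<{count}f", buf.read(4 * count)))
    return name, shape, values

# A state dict is just name -> tensor; plain lists stand in for torch tensors here.
state_dict = {"wq": ([2, 2], [1.0, 2.0, 3.0, 4.0])}
buf = BytesIO()
for name, (shape, values) in state_dict.items():
    write_tensor(buf, name, shape, values)
buf.seek(0)
print(read_tensor(buf))  # ('wq', (2, 2), [1.0, 2.0, 3.0, 4.0])
```

For a real model you would iterate `torch.load(path)["state_dict"]` (or the checkpoint's own layout) instead of the toy dict, and cast tensors to fp16 or a quantized type before writing.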

> Great to hear that the v2 is an improvement. For my use case, the main metric I care about is time to first token. What does that look like for...

> [Here's one](https://huggingface.co/iambestfeed/open_llama_3b_4bit_128g). It's the one the results in the README are based on. Seems to work alright. Thanks. This is the result of test_benchmark_inference using "-p -ppl": notebook, 5900HS,...

> @SolsticeProjekt
>
> > https://huggingface.co/SinanAkkoyun/orca_mini_3b_gptq_badtest :)
>
> This is for actual chatting and not a base model. I quantized it myself, that's why it's called badtest, although it performs...

> > I'm trying to figure out how to quantize models myself
>
> Basically, install AutoGPTQ and look at my model README; you can quantize them with the other...
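For context on what the 4-bit, group-size-128 GPTQ models discussed above actually store, here is a sketch of the simpler round-to-nearest baseline: each group of weights shares one fp scale, and each weight becomes a 4-bit integer. GPTQ improves on this by minimizing layer output error during quantization, which this sketch deliberately omits; the function names and the demo values are illustrative only:

```python
def quantize_q4_groups(weights, group_size=128):
    """Round-to-nearest 4-bit quantization: one scale per group of weights."""
    out = []
    for i in range(0, len(weights), group_size):
        g = weights[i:i + group_size]
        amax = max(abs(v) for v in g) or 1.0
        scale = amax / 7.0  # map the group's max magnitude onto the 4-bit range
        # Signed 4-bit integers span -8..7; clamp after rounding.
        q = [max(-8, min(7, round(v / scale))) for v in g]
        out.append((scale, q))
    return out

def dequantize(groups):
    """Reconstruct approximate fp weights from (scale, int4-list) groups."""
    return [scale * v for scale, q in groups for v in q]

# Demo: quantize one small group and reconstruct it.
print(dequantize(quantize_q4_groups([0.7, -0.7, 0.0, 0.1], group_size=4)))
```

The per-group scale is why group size (the `128g` in the model names above) matters: smaller groups track local weight magnitudes more closely at the cost of storing more scales.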