FNsi

Results: 68 comments by FNsi

Thanks for doing that! I successfully quantised the 30B model to q4_1; here's my test. > apx run ./main -m models/30B/ggml-model-q4_1.bin -n 4096 --temp 0.7 --top_p 0.5 --repeat_penalty 1.17647 -c 4096...

Sorry, I just found the mistake I made. I have already changed the line to the q4_1 bin and resubmitted the current response. > Thank you! > > > > @FNsi,...

Sorry, but does anyone know how to merge the LoRA into the raw model?
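For context, merging a LoRA adapter means folding its low-rank update back into the base weights: W' = W + (alpha / r) * B @ A. The sketch below is a toy, pure-Python illustration of that arithmetic, assuming illustrative names (`base`, `lora_A`, `lora_B`, `alpha`, `r`); it is not llama.cpp's or PEFT's actual merge code.

```python
# Hedged sketch: fold a LoRA update into a base weight matrix.
# LoRA stores the update as a low-rank product B @ A, scaled by alpha / r.
# All names here are illustrative, not any library's real API.

def merge_lora(base, lora_A, lora_B, alpha, r):
    """Return base + (alpha / r) * (B @ A), using plain Python lists.

    base:   rows x cols weight matrix
    lora_A: r x cols ("down" projection)
    lora_B: rows x r  ("up" projection)
    """
    scale = alpha / r
    rows, cols = len(base), len(base[0])
    rank = len(lora_A)
    merged = [row[:] for row in base]  # copy so the base stays untouched
    for i in range(rows):
        for j in range(cols):
            update = sum(lora_B[i][k] * lora_A[k][j] for k in range(rank))
            merged[i][j] += scale * update
    return merged

# Tiny example: 2x2 identity base, rank-1 adapter.
base = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 2.0]]    # r x cols
B = [[1.0], [0.5]]  # rows x r
merged = merge_lora(base, A, B, alpha=1.0, r=1)
```

After merging, the adapter files are no longer needed at inference time, since the update lives inside the dense weights.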

I found that sometimes the AI thinks it is chatting in a forum, so the user can get kicked out of the chat 😂

Seems the `-n` flag makes a difference. `-n 100000` no longer works 🤔. But after changing `-n` to 4096 it still doesn't work...

> Seems the `-n` flag makes a difference. > `-n 100000` no longer works 🤔. > After changing `-n` to 4096 it still doesn't work. Rolled back to my backup...

> some of the last commits changed/touched how memory is handled. > > also there is `-c` you can set up to 2048. Always has been. `-c` works fine for me even with 5000+...

> Kinda similar case here, although I'm unsure which specific commit caused the performance loss > > > > After swapping out the old exe for the new one, I went from 207...

> @FNsi please try again with latest master. Disabling BLAS makes it work, though there's still a little performance loss anyway. A guess from me: is it because BLAS tries...

> It's not the 4-bits - it does not work with F16 either. > > I am almost sure that this F16 BLAS call is somehow wrong: > > >...
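The comment above suspects the F16 BLAS path: in that path, half-precision weights must be promoted to float32 before an sgemm-style call, and the promotion itself is lossy relative to full precision. The toy sketch below, a pure-Python assumption-laden illustration (not ggml code), uses `struct`'s IEEE-754 half format (`'e'`) to show the promotion step and the small error it introduces.

```python
# Hedged sketch: promote f16-stored weights to f32 before a matmul,
# mimicking the shape of an F16-weights / sgemm path. Toy code only.
import struct

def f16_round_trip(x):
    """Round a Python float through IEEE-754 half precision (lossy)."""
    return struct.unpack('e', struct.pack('e', x))[0]

def matmul_f16_weights(w_f16, x):
    """Multiply a weight matrix (stored as f16) by an f32 vector,
    promoting each weight to f32 at load time."""
    return [sum(f16_round_trip(w) * v for w, v in zip(row, x))
            for row in w_f16]

w = [[0.1, 0.2], [0.3, 0.4]]  # conceptually stored as f16 in the model file
x = [1.0, 1.0]
y = matmul_f16_weights(w, x)  # close to [0.3, 0.7], but not exact
```

The point is only that the promotion is a distinct, error-introducing step; a bug in where or how it happens would affect F16 and quantized paths alike, matching the observation that F16 fails too.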