terafo

Results: 9 comments by terafo

@LovecraftianHorror I can't reproduce this on the `master` branch. Not sure if I should close this issue, since it isn't fixed in the released version.

A bigger model generally gives better results, but it uses more RAM and runs slower. If you want fast results, use 7B; for the best results, use whatever fits in your RAM.
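To make "whatever fits in your RAM" concrete, here is a hypothetical back-of-the-envelope sketch (not from the thread): estimating the RAM footprint of a quantized model so you can pick the largest size that fits. The bit width and overhead factor are illustrative assumptions, not measured values.

```python
def model_ram_gib(n_params_billion: float, bits_per_weight: float,
                  overhead: float = 1.2) -> float:
    """Rough RAM footprint in GiB: weight bytes plus ~20% assumed
    overhead for runtime buffers and context (illustrative only)."""
    weight_bytes = n_params_billion * 1e9 * bits_per_weight / 8
    return weight_bytes * overhead / 2**30

# Compare common LLaMA sizes at 4-bit quantization.
for n in (7, 13, 33, 65):
    print(f"{n}B -> ~{model_ram_gib(n, 4):.1f} GiB")
```

Under these assumptions, 7B fits comfortably on most machines, while the largest models need tens of gigabytes, which is why "fast" and "best" pull in opposite directions.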

You would need a GPU with tens of gigabytes of VRAM, and you'd have to use another fork.

llama.cpp is made only for inference; it doesn't have training functionality. It wouldn't make sense to do that on a CPU for a model of that size anyway. Meta didn't release LLaMa...

Found some holes in my fix that weren't in the original code; the dot product still needs fixing.

Actually fixed that. It's a bit messy, but it works with everything I threw at it.

Theoretically it could be done at runtime rather than in a separate file, but that would take a lot of time, use more RAM, and would require loading the full set...

I'm making progress towards generic vector support in a separate [branch](https://github.com/terafo/tinygrad/tree/vector), since it requires changing too much stuff that isn't related to Hexagon. I'll open a separate draft PR for...

Just a friendly reminder that it's still outdated.