RazeLighter777

10 comments by RazeLighter777

It would be interesting if you could implement a customizable priority list of transcoding methods. When ffmpeg returns an error, it would simply fall back to the next-highest-priority transcoding method on the...

Note that failovers should definitely be logged or otherwise made known to the admin, as they might lead some users to think their hardware transcoding is working when it really is...
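A rough sketch of what a priority list with logged failover could look like, in Python. Everything here is hypothetical: `transcode_with_failover`, `TranscodeError`, and the `run` callback are names made up for illustration and are not taken from the project. `run` stands in for whatever actually invokes ffmpeg with a given method and is assumed to raise `TranscodeError` when ffmpeg returns an error.

```python
import logging
from typing import Callable, Sequence

logger = logging.getLogger("transcode")


class TranscodeError(RuntimeError):
    """Hypothetical error raised by a runner when one method fails."""


def transcode_with_failover(
    methods: Sequence[str],
    run: Callable[[str], None],
) -> str:
    """Try each method in priority order and return the one that worked."""
    if not methods:
        raise ValueError("priority list is empty")
    for position, method in enumerate(methods):
        try:
            run(method)  # assumed to raise TranscodeError on failure
        except TranscodeError as exc:
            # Log every failure so the admin can see what went wrong.
            logger.warning("transcoding with %r failed (%s)", method, exc)
            continue
        if position > 0:
            # A lower-priority method succeeded: make the failover visible.
            logger.warning(
                "failover: using %r instead of preferred %r",
                method,
                methods[0],
            )
        return method
    raise TranscodeError("all transcoding methods failed: " + ", ".join(methods))
```

The priority list would come from user configuration, for example `["h264_nvenc", "h264_vaapi", "libx264"]` (illustrative encoder names only). Returning the method that succeeded lets the caller surface it in a dashboard as well as in the log, so nobody assumes hardware transcoding is active when a software fallback is doing the work.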

Usage `-c N, --ctx_size N`

Not 100% sure if this is right but I added it.

Which models are you using? The llama models appear to work only on GPU for me, despite having 48GB of RAM

Fixed this by using the `--gpu-memory` flag to reduce the VRAM allocation by one gigabyte.

Did you try the `--cpu-memory` flag?

@oobabooga Maybe differentiate between frontend and backend extensions? Seems like any model should be able to work with the chat GUI. Some extensions then would be mutually exclusive (you probably couldn't...

This would require code changes, not just compiler flags.

There's also a torrent for the pre-quantized version. Not gonna link it here, but you can find it pretty easily @leiserfg