RazeLighter777 comments

Results 10 comments of RazeLighter777

[feature/bug] Implement (hardware)transcoder failover/prioritisation

It would be interesting if you could implement a customizable priority list of transcoding methods. When ffmpeg returns an error, it simply uses the next highest transcoding method on the...

[feature/bug] Implement (hardware)transcoder failover/prioritisation

Note failovers should definitely be logged or made known to the admin somehow, as they might lead some users to think their hardware transcoding is working when it really is...
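
The idea in the two comments above (an ordered priority list, falling through to the next method on error, and logging every failover so the admin notices) could be sketched roughly as follows. This is only an illustration under stated assumptions: `transcode_with_failover`, `AllMethodsFailed`, and the `attempt` callback are hypothetical names, not part of any existing project, and the callback is assumed to raise an exception whenever ffmpeg reports an error for the given method.

```python
import logging
from typing import Callable, List, Sequence, Tuple

logger = logging.getLogger("transcode_failover")


class AllMethodsFailed(RuntimeError):
    """Raised when every method in the priority list has failed."""


def transcode_with_failover(
    priority: Sequence[str],
    attempt: Callable[[str], None],
) -> str:
    """Try each method in priority order; return the first one that succeeds.

    `attempt` is a hypothetical callback: it runs the transcoder with the
    given method and raises an exception if the run fails.
    """
    errors: List[Tuple[str, Exception]] = []
    for index, method in enumerate(priority):
        try:
            attempt(method)
        except Exception as exc:
            # Record and log the failure, then fall through to the next method.
            errors.append((method, exc))
            logger.warning("transcode method %r failed: %s", method, exc)
            continue
        if index > 0:
            # Make the failover visible so nobody assumes the preferred
            # (e.g. hardware) method is the one actually in use.
            logger.warning(
                "failover: using %r instead of preferred %r",
                method,
                priority[0],
            )
        return method
    raise AllMethodsFailed(errors)
```

A caller would pass something like `["nvenc", "vaapi", "software"]` (illustrative labels only) together with a function that launches ffmpeg for that method. Logging at warning level is one possible way to surface failovers; a real implementation might also expose them in an admin dashboard.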

added ctx_size parameter

Not 100% sure if this is right but I added it.

Support for CPU-Only Systems with limited RAM

Which models are you using? The llama models appear to work only on GPU for me, despite having 48 GB of RAM.

RuntimeError: CUDA error: CUBLAS_STATUS_NOT_INITIALIZED when calling `cublasCreate(handle)`

Fixed this by lowering the VRAM limit passed to the `--gpu-memory` flag by one gigabyte.

Limit CPU RAM usage when offloading model to GPU

Did you try the `--cpu-memory` flag?

Add tabs to make the interface more like AUTOMATIC1111

@oobabooga Maybe differentiate between frontend and backend extensions? It seems like any model should be able to work with the chat GUI. Some extensions would then be mutually exclusive (you probably couldn't...

Add avx-512 support?

This would require code changes, not just compiler flags.

Please, document how to get the model

There's also a torrent for the pre-quantized version. Not gonna link it here, but you can find it pretty easily @leiserfg