Justine Tunney comments

Results: 655 comments of Justine Tunney

Feature Request: Automate update of upstream llama.cpp

Thanks for your patience! I've got the BF16 fix in for you. It'll be rolled out in the release. Whenever you need *anything* merged please do take the time to...

Feature Request: Automate update of upstream llama.cpp

Also if anyone has any ideas on how we might go about solving the issue properly, by implementing Apple Metal GPU support for BF16, I'm willing to take a crack...

[`Ubuntu 22.04` with 2x `NVIDIA 4090`] Text generation fails with `--gpu nvidia` flag

It looks like multiple device support regressed during the last llama.cpp upgrade. You can work around this by setting `export CUDA_VISIBLE_DEVICES=1` before running llamafile. You can also get dual gpu...

Support for Stable Diffusion image generators

It'd be nice to have an easier way to generate cat photos on the command line. One project we could use is https://github.com/leejet/stable-diffusion.cpp They appear to depend on GGML but...

Bug: ^C doesn't stop model

What binary did you run? What was your command line invocation?

Lots of time spent in memory subsystem

Are you using the linear memory optimization? It should be enabled on most platforms by default, unless you're disabling it by passing the '-m' flag. If I'm running on Linux...

Lots of time spent in memory subsystem

Have you read these sections of the readme? - https://github.com/jart/blink#virtualization - https://www.wired.com/story/apple-csam-scanning-heat-initiative-letter/ The reason why `-m` is costly is because it does full memory virtualization. It has to indirect memory...

Lots of time spent in memory subsystem

Numerical instability of gradient calculation of tf.norm (nan at 0, inf for small values)

Bug: --gpu option cannot work on win10, not friendly to WIN.