Justine Tunney
Thanks for your patience! I've got the BF16 fix in for you. It'll be rolled out in the release. Whenever you need *anything* merged please do take the time to...
Also if anyone has any ideas on how we might go about solving the issue properly, by implementing Apple Metal GPU support for BF16, I'm willing to take a crack...
It looks like multiple device support regressed during the last llama.cpp upgrade. You can work around this by running `export CUDA_VISIBLE_DEVICES=1` before running llamafile. You can also get dual GPU...
It'd be nice to have an easier way to generate cat photos on the command line. One project we could use is https://github.com/leejet/stable-diffusion.cpp. They appear to depend on GGML but...
What binary did you run? What was your command line invocation?
Are you using the linear memory optimization? It should be enabled on most platforms by default, unless you're disabling it by passing the `-m` flag. If I'm running on Linux...
Have you read these sections of the readme? - https://github.com/jart/blink#virtualization - https://www.wired.com/story/apple-csam-scanning-heat-initiative-letter/ The reason `-m` is costly is that it does full memory virtualization. It has to indirect memory...
You're also invited to join our Discord https://discord.gg/Hb4QHYj2
Assigning gradient issue to @girving
Have you tried using the latest version? 0.6.2 is from a very long time ago.