Justine Tunney

533 comments by Justine Tunney

Sorry I didn't notice this until now! Don't be shy to ping me at [email protected] if I ever fall behind. I'll review this as soon as possible. Promise :)

Ok I took a look finally. If we can maintain backwards compatibility with what I have installed on my system (Ubuntu 14.04) then I'm happy.

I'm so sorry, but when you get around to updating the PR, you'll get a conflict rebasing this on master. I just migrated the project to GNU autotools. (That included moving...

If NASA engineers end up noticing a comment like that, from a scrappy little telecom library like ours, they'll probably take it as a compliment!

I support this. Are you a Debian maintainer who would be able to be of assistance?

Try comparing `./mistral-7b-instruct-v0.2.Q5_K_M.llamafile --version` with `llamafile --version`. If they're the same, then they *should* behave identically. You can also use `unzip` to extract the gguf file from the llamafile and...
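The `unzip` step mentioned above can also be sketched in Python. This is only an illustration, assuming the llamafile can be read as an ordinary zip archive (which the `unzip` suggestion implies); the helper names `list_gguf_members` and `extract_gguf` are hypothetical and not part of llamafile.

```python
import zipfile


def list_gguf_members(path):
    """Return the names of archive members whose names end in .gguf."""
    with zipfile.ZipFile(path) as archive:
        return [name for name in archive.namelist()
                if name.lower().endswith(".gguf")]


def extract_gguf(path, dest_dir):
    """Extract every .gguf member into dest_dir and return the written paths."""
    with zipfile.ZipFile(path) as archive:
        return [archive.extract(name, dest_dir)
                for name in archive.namelist()
                if name.lower().endswith(".gguf")]
```

For example, `list_gguf_members("mistral-7b-instruct-v0.2.Q5_K_M.llamafile")` would report which weights files are embedded, provided the file is present in the working directory.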

Ragel is only a dependency if you install from git. If you download the tarball, you don't need it. I'd be happy to accept a spec file.

Is your program using `MAP_FIXED`? You can use `blink -s` to trace system calls and find out. Chances are it's requesting fixed memory that overlaps with memory Android OS or...

@USBhost Unfortunately no. The K quants were designed to exploit under-utilization of CPU resources when doing matvecs. I tried copying and pasting the `Q5_K_M` code into a tinyBLAS 2-d block-tiling...

The tinyBLAS code upstreamed by Mozilla's llamafile project makes prompt processing go very fast for F32, F16, Q4\_0, and Q8\_0. | model | size | params | backend | threads...