
Security

Open cat-state opened this issue 1 year ago • 7 comments

Llamafile is a great convenience because it bundles the inference code with the weights. However, in its current form it offers users less security than using untrusted safetensors/gguf weights with a separately downloaded (trusted) model implementation. If llamafile takes off, users will be executing random executables downloaded from HF and generated by random people, which opens a security hole: anyone can distribute a backdoored llama.cpp implementation.

What would be the best way to address these security concerns?

  • Ultimately, if you bundle the inference code and weights, there is no way to verify anything (since the verifier must be bundled too), so maybe educating users is all one can do.
  • Huggingface could check that .llamafiles are generated correctly by introspecting the binary to see that it's just llama.cpp + the weights, but keeping up with llama.cpp updates would add overhead. It could also provide some autoconversion from gguf to llamafile, ensuring the security of the output.
  • One could add norms for verifying signatures of downloaded llamafiles (e.g. making the most commonly copy-pasted commands verify signatures against some PKI), or include a package manager, but the former allows attacks via mis-authentication and the latter fails because it is not the "most convenient route".
  • The most secure but least convenient outcome would be a norm of distributing llama.cpp as an APE and downloading the .gguf separately, replacing instances of wget huggingface.com/.../xyz.llamafile && ./xyz.llamafile with curl llamaup.com/up | bash and llamafile huggingface.com/.../xyz.gguf, like the "separately downloaded llamafile server" example in the docs.
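
As a rough illustration of the "verify before you execute" idea in the third bullet, here is a minimal Python sketch that compares a downloaded file's SHA-256 against a digest obtained from a channel you trust. This is a plain checksum comparison, not signature verification against a PKI, and it only helps if the expected digest comes from somewhere more trustworthy than the download itself. The function names and the file name in the usage note are hypothetical, chosen for illustration only.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Model files are large, so never load the whole file into memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """True only if the file hashes to the digest taken from a trusted channel.

    Assumption: expected_hex was published by the author out of band;
    this function cannot tell you whether that channel is trustworthy.
    """
    actual = sha256_of_file(path)
    return hmac.compare_digest(actual, expected_hex.strip().lower())
```

A caller would run something like matches_published_digest("xyz.llamafile", published_digest) before marking the file executable, and refuse to run it on a mismatch.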

cat-state avatar Nov 30 '23 12:11 cat-state

Good point @cat-state. We may go even further by answering the question: who is the audience of this project?

I see three groups:

  1. LLM researchers who want to share models in a convenient way
  2. privacy-oriented users who don't want to pass their data to external providers
  3. geeks who want to check a new trending thing.

The first group may be mostly interested in the verification of authorship (Does this binary really come from a trusted researcher?). The second group wants to be aware that they are trading security for privacy (external REST-like API cannot execute arbitrary code). The last group may be interested in not installing ransomware while chatting with AI.

I think the easiest workaround is to put the executable inside a Docker container, but I'm unsure about the details.

macie avatar Nov 30 '23 20:11 macie

Thank you for this feedback! We agree this is a concern. We’ve personally validated and can vouch for the example wrapped weights we provide with this project. Our plan next (soon!) is to add a tool that users can use to validate that their llamafile’s llama.cpp bits are from a source they trust. There is also functionality in Cosmopolitan (pledge(), specifically) that we are considering using to further secure things. We will update this issue soon with more details.

stlhood avatar Nov 30 '23 22:11 stlhood

Just adding to this... the current llamafile-server-0.2.1 exe file is showing up as a malicious file when scanned with VirusTotal. I would like to try it out... but won't trust it until this is cleared up.

https://www.virustotal.com/gui/file/2b3c692e50d903cbf6ac3d8908f8394101b5be5f8a4573b472975fa8c9f09e68

geoffsmith82 avatar Dec 04 '23 08:12 geoffsmith82

@geoffsmith82 Only 7 out of 70 AVs are reporting false positives. That's actually a pretty good score. Please consider that llamafile uses a novel polyglot executable format we designed in order to help more people run LLMs on their personal computers. If you need to wait for a consensus to emerge in the AV industry that polyglot file formats are good, before you can try llamafile, then you could be waiting for quite some time. We're happy to file reports with AVs that have submission forms like Microsoft, so please tell us if you ever have issues with Windows Defender (which is green in the link you provided) but I can't offer you any assurances regarding the other 69.

jart avatar Dec 04 '23 10:12 jart

FYI, AVG reported llamafile-server-0.3 as infected with FileRepMalware (even though AVG did not flag the file in the above report).

I sent a false positive report to them with a link to this issue and a reference to the polyglot file format.

JoshuaCWebDeveloper avatar Dec 12 '23 22:12 JoshuaCWebDeveloper

@JoshuaCWebDeveloper If AVG white-labels Windows Defender then try updating to the latest malware definitions. I submitted the 0.3 release to Microsoft Security Intelligence earlier today (see https://www.microsoft.com/en-us/wdsi/submission/a81e9778-a046-46c9-8221-ed18ede17850) due to a Windows Defender issue. The ticket got resolved. Therefore I believe AV issues with 0.3 to be resolved.

jart avatar Dec 13 '23 01:12 jart

@cat-state how's this ticket? Has your question been resolved?

mofosyne avatar May 29 '24 09:05 mofosyne