LLMFarm
LLaMA and other large language models on iOS and macOS, offline, using the GGML library.
LLMFarm is an iOS and macOS app for working with large language models (LLMs). It lets you load different LLMs with adjustable parameters. With LLMFarm, you can test the performance of different LLMs on iOS and macOS and find the most suitable model for your project.
Based on ggml and llama.cpp by Georgi Gerganov.
It also uses sources from:
- rwkv.cpp by saharNooby
- Mia by byroneverson
- LlamaChat by alexrozanski
Features
- [x] macOS (13+)
- [x] iOS (16+)
- [x] Various inferences
- [x] Various sampling methods
- [x] Metal (does not work on Intel Macs)
- [x] Model setting templates
- [x] LoRA adapters support
- [x] LoRA finetune support
- [x] LoRA export as model support
- [x] Restore context state
- [x] Apple Shortcuts
Inferences
- [x] LLaMA 1, 2, 3
- [x] Gemma
- [x] Phi models
- [x] GPT2 + Cerebras
- [x] Starcoder (Santacoder)
- [x] Falcon
- [x] MPT
- [x] Bloom
- [x] StableLM-3b-4e1t
- [x] Qwen
- [x] Yi models
- [x] Deepseek models
- [x] Mixtral MoE
- [x] PLaMo-13B
- [x] Mamba
- [x] RWKV (20B tokenizer)
- [x] GPTNeoX
- [x] Replit
Multimodal
- [x] LLaVA 1.5 models, LLaVA 1.6 models
- [x] BakLLaVA
- [x] Obsidian
- [x] ShareGPT4V
- [x] MobileVLM 1.7B/3B models
- [x] Yi-VL
- [x] Moondream
Note: For Falcon, Alpaca, GPT4All, Chinese LLaMA / Alpaca and Chinese LLaMA-2 / Alpaca-2, Vigogne (French), Vicuna, Koala, OpenBuddy (Multilingual), Pygmalion/Metharme, WizardLM, Baichuan 1 & 2 + derivatives, Aquila 1 & 2, Mistral AI v0.1, Refact, Persimmon 8B, MPT, and Bloom, select the LLaMA inference in model settings.
Sampling methods
- [x] Temperature (temp, top-k, top-p); see the sampling sketch after this list
- [x] Tail Free Sampling (TFS)
- [x] Locally Typical Sampling
- [x] Mirostat
- [x] Greedy
- [x] Grammar (does not work for GPTNeoX, GPT-2, RWKV)
- [ ] Classifier-Free Guidance
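To make the first of these methods concrete, here is a minimal, self-contained Swift sketch of temperature scaling combined with top-k and top-p (nucleus) filtering over raw logits. It illustrates the general technique only; the function name `sample` and the default values are assumptions for this sketch, not LLMFarm's or llama.cpp's actual implementation.

```swift
import Foundation

// Minimal sketch: temperature + top-k + top-p sampling over raw logits.
// Illustrative only; not LLMFarm's actual implementation.
// Assumes `logits` is non-empty.
func sample(logits: [Float], temperature: Float = 0.8,
            topK: Int = 40, topP: Float = 0.95) -> Int {
    // 1. Apply temperature: lower values sharpen the distribution.
    let scaled = logits.map { $0 / max(temperature, 1e-6) }

    // 2. Keep only the top-k highest-scoring tokens.
    let ranked = scaled.enumerated().sorted { $0.element > $1.element }
    var candidates = Array(ranked.prefix(topK))

    // 3. Softmax over the remaining candidates (shifted for stability).
    let maxLogit = candidates[0].element
    let exps = candidates.map { exp(Double($0.element - maxLogit)) }
    let total = exps.reduce(0, +)
    var probs = exps.map { $0 / total }

    // 4. Top-p (nucleus): keep the smallest prefix whose mass reaches topP.
    var cumulative = 0.0
    var cutoff = probs.count
    for (i, p) in probs.enumerated() {
        cumulative += p
        if cumulative >= Double(topP) { cutoff = i + 1; break }
    }
    candidates = Array(candidates.prefix(cutoff))
    probs = Array(probs.prefix(cutoff))

    // 5. Draw one token from the renormalized candidate distribution.
    var r = Double.random(in: 0..<probs.reduce(0, +))
    for (i, p) in probs.enumerated() {
        r -= p
        if r <= 0 { return candidates[i].offset }
    }
    return candidates[cutoff - 1].offset
}
```

Greedy sampling is the degenerate case of the above: skip the filtering and simply take the argmax of the logits.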
Getting Started
You can find answers to some questions in the FAQ section.
Inference options
When you create a chat, a JSON file is generated in which you can specify additional inference options. The chat files are located in the "chats" directory. You can see all inference options here.
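For illustration, such a settings file can be read with Swift's Codable. The struct fields and file path below are hypothetical stand-ins, not the app's actual schema; consult the generated files in the "chats" directory for the real keys.

```swift
import Foundation

// Hypothetical schema: the real keys are whatever LLMFarm writes into the
// JSON files under the "chats" directory.
struct ChatSettings: Codable {
    let model: String   // hypothetical field
    let temp: Float     // hypothetical field
    let top_k: Int      // hypothetical field
    let top_p: Float    // hypothetical field
}

do {
    let url = URL(fileURLWithPath: "chats/my_chat.json") // hypothetical path
    let data = try Data(contentsOf: url)
    let settings = try JSONDecoder().decode(ChatSettings.self, from: data)
    print("Loaded \(settings.model) at temperature \(settings.temp)")
} catch {
    print("Could not read chat settings: \(error)")
}
```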
Models
You can download some of the supported models here.
Development
llmfarm_core has been moved to a separate repository. To build LLMFarm, you need to clone this repository recursively:
```bash
git clone --recurse-submodules https://github.com/guinmoon/LLMFarm
```
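If you have already cloned the repository without `--recurse-submodules`, you can fetch the submodules (including llmfarm_core) afterwards with a standard git command:

```bash
git submodule update --init --recursive
```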

