Mental Diffusion
Stable Diffusion command-line interface
Powered by Diffusers
Version 0.7.5 alpha
Torch 2.2.2 +cu121

- Command-line interface
- Websockets server
- Websockets client (Electron)
- SD 1.5, SDXL, SDXL-Turbo
- VAE, TAESD, LoRA
- Text-to-Image, Image-to-Image, Inpaint
- Latent preview for SD/SDXL (bmp/webp)
- Upscaler Real-ESRGAN x2/x4/anime
- Read and write PNG with metadata
- Optimized for low specs
- Supports CPU and GPU
Installation
- Install Python 3.11.x
- Install Python packages (see installer.py or requirements.txt)
- Install Electron
git clone https://github.com/nimadez/mental-diffusion.git
edit src/config.json
Start server
cd mental-diffusion
python src/mdx.py -serv 8011
Start client
cd mental-diffusion
electron src/client/.
Start headless
python mdx.py -p "prompt" -c /sd.safetensors -st 20 -g 7.5 -f img_{seed}
python mdx.py -p "prompt" -mode xl -c /sdxl.safetensors -w 1024 -h 1024 -st 30 -g 8.0 -f img_{seed}
python mdx.py -p "prompt" -pipe img2img -i image.png -sr 0.5
python mdx.py -p "prompt" -pipe inpaint -i image.png -m mask.png
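For scripted headless runs, the CLI can also be driven from Python. A minimal sketch using subprocess, with the checkpoint path and prompts as placeholders; each call reloads the checkpoint, so prefer --server or --batch for long runs:

# Drive the headless CLI from Python (sketch; paths and prompts are placeholders).
# Each invocation reloads the checkpoint, so --server or --batch is faster for many runs.
import subprocess

prompts = [
    "a lighthouse at dusk, watercolor",
    "a desert city at night, oil painting",
]

for prompt in prompts:
    subprocess.run(
        [
            "python", "mdx.py",        # run from the src directory, or adjust the path
            "-p", prompt,
            "-c", "/sd.safetensors",   # checkpoint path (placeholder)
            "-st", "20",
            "-g", "7.5",
            "-f", "img_{seed}",        # {seed} is replaced with the actual seed
        ],
        check=True,
    )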
These models are downloaded as needed after launch:
openai/clip-vit-large-patch14 (1.6 GB)
laion/CLIP-ViT-bigG-14-laion2B-39B-b160k (10 MB)
madebyollin/taesd (20 MB)
madebyollin/taesdxl (20 MB)
RealESRGAN_x2plus.pth (65 MB, optional)
RealESRGAN_x4plus.pth (65 MB, optional)
RealESRGAN_x4plus_anime_6B.pth (20 MB, optional)

Command-line
--help show this help message and exit
--server -serv int start websockets server (port is required)
--metadata -meta str /path-to-image.png, extract metadata from PNG
--model -mode str sd/xl, set checkpoint model type (def: config.json)
--pipeline -pipe str txt2img/img2img/inpaint, define pipeline (def: txt2img)
--checkpoint -c str checkpoint .safetensors path (def: config.json)
--vae -v str optional vae .safetensors path (def: null)
--lora -l str optional lora .safetensors path (def: null)
--lorascale -ls float 0.0-1.0, lora scale (def: 1.0)
--scheduler -sc str ddim, ddpm, lcm, pndm, euler_anc, euler, lms (def: config.json)
--prompt -p str positive prompt text input (def: sample)
--negative -n str negative prompt text input (def: empty)
--width -w int width value must be divisible by 8 (def: config.json)
--height -h int height value must be divisible by 8 (def: config.json)
--seed -s int seed number, -1 to randomize (def: -1)
--steps -st int steps from 1 to 100+ (def: 25)
--guidance -g float 0.0-20.0+, how closely the output follows the prompt (def: 8.0)
--strength -sr float 0.0-1.0, how much the input image is altered; lower values preserve more of the original (def: 1.0)
--image -i str PNG file path or base64 PNG (def: null)
--mask -m str PNG file path or base64 PNG (def: null)
--savefile -sv bool true/false, save image as PNG with embedded metadata (def: true)
--onefile -of bool true/false, save the final result only (def: false)
--outpath -o str /path-to-directory (def: .output)
--filename -f str filename prefix (no png extension)
--batch -b int number of repeats to run in a batch (def: 1)
--preview -pv bool stepping is slower with preview enabled (def: false)
--upscale -up str x2, x4, x4anime (def: null)
[model settings]
SDXL: 1024x1024 or 768x768, --steps >20
SDXL-Turbo: 512x512, --steps 1-4, --guidance 0.0
LCM-LoRA: "lcm" scheduler, --steps 2-8+, --guidance 0.0-2.0
* --server or --batch is recommended, since the checkpoint does not have to be reloaded for every run
* Add "{seed}" to --filename; it is replaced with the actual seed when the file is saved
* To load SDXL with about 3.5 GB of VRAM, you need at least 16 GB of system memory and virtual-memory paging
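Since --savefile embeds the generation parameters in the PNG and --metadata reads them back, the same text chunks can also be inspected with Pillow. A minimal sketch (the file path is a placeholder; the exact keys are whatever mental-diffusion writes, so everything found is printed):

# Read the text metadata embedded in a generated PNG (sketch).
from PIL import Image

with Image.open(".output/img_12345.png") as im:   # path is a placeholder
    for key, value in (im.text or {}).items():    # PNG tEXt/iTXt chunks
        print(f"{key}: {value}")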
Websockets Client
- The client fetches config data from the server on first launch only (refresh to update)
- You can continue from the preview image with the img2img pipeline
- The last replay is always available and can be saved in .webp format
- Prompts, outpath, and filename are saved and restored on refresh
- The upscaled image is saved to a file and is not returned to the client
- Right-click on the images to open the popup menu
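A minimal sketch of talking to the server from a script instead of the Electron client, assuming the websockets server on the configured port accepts a JSON job; the payload field names below are hypothetical and should be checked against src/mdx.py and src/client for the actual message format:

# Connect to the mdx websockets server (sketch; payload schema is an assumption).
import asyncio
import json
import websockets  # pip install websockets

async def main():
    async with websockets.connect("ws://127.0.0.1:8011") as ws:
        job = {                          # hypothetical field names mirroring the CLI flags
            "prompt": "a watercolor lighthouse at dusk",
            "pipeline": "txt2img",
            "steps": 25,
            "guidance": 8.0,
        }
        await ws.send(json.dumps(job))
        reply = await ws.recv()          # server reply; format not documented here
        print(str(reply)[:200])

asyncio.run(main())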

Latent Preview

Test LoRA + VAE
* Juggernaut Aftermath, TRCVAE, World of Origami
Test SDXL
* OpenDalleV1.1
Test SDXL-Turbo
* A cinematic shot of a baby racoon wearing an intricate italian priest robe.
Known Issues
:: Stepping is slower with preview enabled
The preview uses the BMP format, which has no compression.
Reminder: one way to regain speed is to set "pipe._guidance_scale" to 0.0 after roughly 40% of the steps.
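That trick is the dynamic classifier-free guidance pattern supported by the diffusers step callback; a hedged sketch of how it is typically wired up (mdx.py may implement it differently, and the checkpoint path is a placeholder):

# Disable classifier-free guidance after ~40% of the steps (sketch).
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_single_file(
    "/sd.safetensors", torch_dtype=torch.float16   # checkpoint path is a placeholder
).to("cuda")

def disable_cfg_after_40(pipe, step_index, timestep, callback_kwargs):
    if step_index == int(pipe.num_timesteps * 0.4):
        # keep only the text-conditioned half of the batch and zero the guidance scale
        callback_kwargs["prompt_embeds"] = callback_kwargs["prompt_embeds"].chunk(2)[-1]
        pipe._guidance_scale = 0.0
    return callback_kwargs

image = pipe(
    "a watercolor lighthouse at dusk",
    num_inference_steps=25,
    callback_on_step_end=disable_cfg_after_40,
    callback_on_step_end_tensor_inputs=["prompt_embeds"],
).images[0]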
:: Interrupt button does not work
You have to wait for the current step to finish;
the interrupt is applied at the beginning of each step.
CTRL+C terminates the server, and as a result the interrupt button will no longer work.
:: Connection timeout error on loading checkpoint
Your access to Hugging Face may be restricted by your ISP.
Set "http_proxy" in config.json (e.g. "127.0.0.1:8118")
History
↑ Back to the roots (diffusers)
↑ Ported to VS Code
↑ Switch from Diffusers to ComfyUI
↑ Upgrade from sdkit to Diffusers
↑ Undiff renamed to Mental Diffusion
↑ Undiff started with "sdkit"
↑ Created for my personal use

License
Code released under the MIT license.