
Swift generation produces different style/quality images compared to other SD tools

Open kasima opened this issue 1 year ago • 3 comments

I've been experimenting with image generation in Swift using the converted Core ML models. It seems to produce images of a different style (and noticeably worse quality?) than other Stable Diffusion tools for a given model version and set of generation parameters. The Python CLI, using the same converted Core ML models, produces images that are in the same vicinity as the other tools.

I'm new to the AI image generation space and would much appreciate any help with a few questions:

  • Does the last row of images (Swift CLI) seem like it's very different from the other rows to anyone else? It's somewhat subjective.
  • Why would generation with Swift consistently produce images whose quality/style/coherence differ from those of all the other tools?
  • How can I get Swift generation on the Neural Engine to produce images of similar quality to the other Stable Diffusion tools?

Here's what I've been looking at:

Parameters

  • Model: a version of Stable Diffusion 1.5
  • Prompt: personification of Halloween holiday in the form of a cute girl with short hair and a villain's smile, cute hats, cute cheeks, unreal engine, highly detailed, artgerm digital illustration, woo tooth, studio ghibli, deviantart, sharp focus, artstation, by Alexei Vinogradov bakery, sweets, emerald eyes (Inspired by this image)
  • Steps: 30
  • Guidance: 10
  • Seed: random
Tool comparison (each row originally showed four sample images; filenames omitted here):

  • DreamStudio
  • Google Colab
  • DiffusionBee (local)
  • InvokeAI (local)
  • Python CLI (local Core ML)
  • Swift CLI (local Core ML)
Python CLI generation command

pre-converted model from huggingface

python -m python_coreml_stable_diffusion.pipeline --prompt "personification of Halloween holiday in the form of a cute girl with short hair and a villain's smile, cute hats, cute cheeks, unreal engine, highly detailed, artgerm digital illustration, woo tooth, studio ghibli, deviantart, sharp focus, artstation, by Alexei Vinogradov bakery, sweets, emerald eyes" -i /Users/kasima/src/huggingface/apple/coreml-stable-diffusion-v1-5/original/packages -o /Users/kasima/scratch --compute-unit ALL --model-version "runwayml/stable-diffusion-v1-5" --num-inference-steps 30 --guidance-scale 10

Swift CLI generation command

pre-converted model from huggingface

swift run StableDiffusionSample "personification of Halloween holiday in the form of cute girl with short hair and a villain's smile, cute hats, cute cheeks, unreal engine, highly detailed, artgerm digital illustration, woo tooth, studio ghibli, deviantart, sharp focus, artstation, by Alexei Vinogradov bakery, sweets, emerald eyes" --negative-prompt "" --resource-path /Users/kasima/src/huggingface/apple/coreml-stable-diffusion-v1-5/split_einsum/compiled/ --output-path /Users/kasima/scratch/swiftcli/comparison --step-count 30 --guidance-scale 10 --image-count 4

kasima avatar Dec 29 '22 14:12 kasima

@kasima maybe this is a silly question, but I see no mention of the seeds used to generate these images. Did you use the same seeds for each column in these examples? And if so, could you share them so we can try to replicate these results?

GuiyeC avatar Dec 29 '22 16:12 GuiyeC

The issue is that this Core ML implementation currently supports only two samplers. The usual SD tools all support additional samplers, such as Euler, which I've found produce great results.

We'll have to wait until Apple implements other samplers in this codebase.

H1p3ri0n avatar Dec 29 '22 20:12 H1p3ri0n

@GuiyeC – All the images were generated with random seeds (updated in the original post). The images in the columns aren't necessarily related to each other. Columns were used for formatting. However, it's an interesting idea to try to keep the seeds the same. Will try that when I get a chance.

@timevision – So the samplers/schedulers might have something to do with it? I believe at least a few of them are using the default PNDM scheduler (Google Colab, Python CLI, Swift CLI, and probably Diffusion Bee as well). Will confirm and try regenerating with the same scheduler.

kasima avatar Dec 30 '22 04:12 kasima
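A footnote on the seed point above: pinning the seed makes every run start from bit-identical latent noise, which is what makes cross-tool comparisons apples-to-apples. A minimal NumPy sketch (the shape is illustrative: SD 1.5 uses a 4×64×64 latent for 512×512 output, but real pipelines draw this noise with their own RNGs, so identical seeds across different tools do not guarantee identical latents):

```python
# Sketch: with a fixed seed the initial latent noise is reproducible, so
# two runs seeded identically start denoising from the same point. With
# random seeds, every image in the comparison grid starts from different
# noise, and per-column comparisons are not meaningful.
import numpy as np

def initial_latents(seed, shape=(1, 4, 64, 64)):
    """Draw the Gaussian latent tensor a diffusion run starts from."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal(shape)

a = initial_latents(93)
b = initial_latents(93)   # same seed: bit-identical starting noise
c = initial_latents(94)   # different seed: different starting noise

print(np.array_equal(a, b))
print(np.array_equal(a, c))
```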