
Add variable batch to SD compilation

Open · apivovarov opened this issue 2 years ago · 1 comment

Description

This PR allows compiling the SD pipeline with a variable batch size. Height and width remain fixed and are defined by the model variant (base: 512, regular: 768).
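For reference, a variable dimension in AITemplate is declared by passing an IntVar range instead of a fixed int when building the graph. A minimal sketch, assuming the 512x512 base model and NHWC layout (the tensor name and latent shape here are illustrative only, not taken from this PR):

from aitemplate.frontend import IntVar, Tensor

# Dynamic batch covering the compiled range 1..8; a plain int here
# would yield a static-batch model instead.
batch = IntVar(values=[1, 8], name="batch_size")

# Latent input: H and W stay fixed (512 / 8 = 64 for the base model).
latents = Tensor(shape=[batch, 64, 64, 4], name="latents", is_input=True)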

Usage example end-to-end:

# Download SD 2.1 base model (512x512)
python3 scripts/download_pipeline.py \
--model-name "stabilityai/stable-diffusion-2-1-base"

# Compile with variable batch size 1..8
python3 scripts/compile.py \
--batch-size 1 8 \
--width 512 --height 512

# Run model with batch 4, for example
python3 scripts/demo.py \
--prompt "a photo of an astronaut riding a horse on mars" \
--batch 4 \
--width 512 --height 512

Testing

Tested compile.py for the batch range (1..8) and demo.py / demo_img2img.py for batch sizes 1, 4, and 8. Image quality is good in all cases. Tested on T4 (SM75) and A100 (SM80) GPUs.

Performance impact

I compared two models:

  1. compiled with variable --batch-size 1 8
  2. compiled with static --batch-size 1

demo.py performance is identical for both models when run with --batch 1.

apivovarov · Jun 20 '23 04:06

This is redundant: compile_alt.py already handles dynamic shapes, so there is no need to add dynamic batch support to compile.py. This also essentially reverts your own changes from #755 and #765. compile.py previously supported setting a static batch size for UNet and VAE while using a dynamic batch for CLIP, and removing that is what caused your issue in the first place.
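A rough sketch of the prior split described above (the variable names are hypothetical; only the static/dynamic distinction matters):

from aitemplate.frontend import IntVar

# Pre-#755/#765 arrangement: fixed batch for UNet and VAE,
# dynamic batch only for CLIP.
unet_vae_batch = 1                                 # static
clip_batch = IntVar(values=[1, 8], name="batch")   # dynamic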

Also, batch size is dealt with incorrectly in this PR.

batch_size=(
    batch_size[0] * 2,
    batch_size[1] * 2,
),  # double batch size for unet

Why is the minimum batch size being doubled? The purpose of doubling the maximum batch size is to support Classifier-Free Guidance (CFG); it does not make sense to also double the minimum, because that removes the ability to perform inference without CFG.
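Concretely, only the maximum needs doubling: CFG concatenates the conditional and unconditional latents into a single UNet call (2x the batch), while non-CFG inference does not. A sketch of the corrected range, reusing the variable name from the snippet above:

batch_size=(
    batch_size[0],      # min stays as-is: plain inference without CFG
    batch_size[1] * 2,  # max doubled: CFG runs cond + uncond in one pass
),  # variable batch range for unet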

hlky · Jun 20 '23 08:06