Flux.1-dev on 24GB VRAM OOM
I have this predict function:
```python
def predict(self) -> Any:
    """Run a single prediction on the model"""
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    vram = int(torch.cuda.get_device_properties(0).total_memory / (1024 * 1024 * 1024))
    print("VRAM", vram)
    pipe = FluxPipeline.from_pretrained(flux_path, torch_dtype=torch.bfloat16).to(device)
    pipe.enable_model_cpu_offload()
    prompt = "A cat holding a sign that says hello world"
    image = pipe(
        prompt,
        height=1024,
        width=1024,
        guidance_scale=3.5,
        num_inference_steps=50,
        max_sequence_length=512,
        generator=torch.Generator("cpu").manual_seed(0),
    ).images[0]
    image.save("flux-dev.png")
    return "flux-dev.png"
```
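For what it's worth, my reading of the diffusers docs is that enable_model_cpu_offload() is meant to replace the .to(device) call rather than follow it, since it manages device placement itself. Here is a minimal sketch of that loading pattern (same flux_path and settings as above), in case the order of those two calls is part of the problem:

```python
# Offload-only loading pattern: the pipeline is NOT moved to CUDA with .to(device);
# enable_model_cpu_offload() moves submodules onto the GPU one at a time instead.
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(flux_path, torch_dtype=torch.bfloat16)
pipe.enable_model_cpu_offload()

image = pipe(
    "A cat holding a sign that says hello world",
    height=1024,
    width=1024,
    guidance_scale=3.5,
    num_inference_steps=50,
    max_sequence_length=512,
    generator=torch.Generator("cpu").manual_seed(0),
).images[0]
```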
I have 24 GB of VRAM (the vram variable reports 23) on an NVIDIA GeForce RTX 4090. But when I run sudo cog predict --setup-timeout 3600, I get an out-of-memory error, even though Flux should be able to run in about 22 GB. I wonder if it is something related to cog/WSL/Docker?
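To rule out the container only seeing part of the card, this is a quick check I can add to predict() to see what the GPU reports from inside cog/Docker/WSL (assuming a reasonably recent PyTorch, which exposes torch.cuda.mem_get_info):

```python
# Report free/total GPU memory as seen from inside the container.
import torch

free, total = torch.cuda.mem_get_info(0)  # returns (free_bytes, total_bytes)
print(torch.cuda.get_device_name(0))
print(f"free: {free / 1024**3:.1f} GiB, total: {total / 1024**3:.1f} GiB")
```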