[BUG] 'StableDiffusionPipeline' object has no attribute 'children'
No luck here; I get 'StableDiffusionPipeline' object has no attribute 'children'. Maybe I am using the wrong diffusers version? diffusers 0.13.1, deepspeed 0.8.2
File "/opt/conda/lib/python3.8/site-packages/deepspeed/module_inject/auto_tp.py", line 19, in get_module_list AttributeError: 'StableDiffusionPipeline' object has no attribute 'children' 'StableDiffusionPipeline' object has no attribute 'children'
Hi @stevensu1977, can you provide the script you used?
Same here. My script looks like the one below.
from diffusers import StableDiffusionPipeline
import deepspeed
....

async def startup_event():
    app.state.pipe = StableDiffusionPipeline.from_pretrained(
        settings.model_name_or_path,
        revision="fp16",
        torch_dtype=torch.float16,
    ).to(settings.device)
    deepspeed.init_inference(
        model=getattr(app.state.pipe, "model", app.state.pipe),  # Transformers models
        mp_size=1,                         # number of GPUs
        dtype=torch.float16,               # dtype of the weights (fp16)
        replace_method="auto",             # lets DeepSpeed automatically identify the layers to replace
        replace_with_kernel_inject=False,  # whether to replace the model with the kernel injector
    )

async def generate(request: GenerationRequest):
    with torch.inference_mode():
        generated_images = app.state.pipe(
            prompt=request.prompts,
            num_inference_steps=request.num_inference_steps,
            guidance_scale=request.guidance_scale,
            negative_prompt=request.negative_prompts,
            num_images_per_prompt=request.num_images_per_prompt,
        )
    img_list = [from_image_to_bytes(generated_image) for generated_image in generated_images.images]
    return JSONResponse(img_list)
I used Python 3.10 with the requirements settings below:
fastapi==0.86.0
pydantic
uvicorn==0.19.0
accelerate
diffusers
torch
transformers
deepspeed
triton==2.0.0
File "/./app.py", line 51, in startup_event
deepspeed.init_inference(
File "/usr/local/lib/python3.10/site-packages/deepspeed/__init__.py", line 311, in init_inference
engine = InferenceEngine(model, config=ds_inference_config)
File "/usr/local/lib/python3.10/site-packages/deepspeed/inference/engine.py", line 139, in __init__
parser_dict = AutoTP.tp_parser(model)
File "/usr/local/lib/python3.10/site-packages/deepspeed/module_inject/auto_tp.py", line 98, in tp_parser
module_list = AutoTP.get_module_list(model)
File "/usr/local/lib/python3.10/site-packages/deepspeed/module_inject/auto_tp.py", line 19, in get_module_list
for child in model.children():
AttributeError: 'StableDiffusionPipeline' object has no attribute 'children'
It looks like the bug can only be reproduced in DeepSpeed 0.8.2 or later (the bug does not depend on the diffusers version).
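For context, the traceback points at AutoTP.get_module_list() iterating model.children(), which assumes the model is a torch.nn.Module; a StableDiffusionPipeline is a container around several modules, not a module itself. A minimal sketch illustrating this (the model ID is just the one mentioned later in this thread):

import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
)
print(isinstance(pipe, torch.nn.Module))       # False: the pipeline is a plain container
print(isinstance(pipe.unet, torch.nn.Module))  # True: its sub-models are nn.Modules
print(hasattr(pipe, "children"))               # False: hence the AttributeError in AutoTP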
I am running into the same issue (using diffusers==0.14.0). @sjkoo1989, which version of DeepSpeed were you able to run?
On my end, 0.8.1 fails with:
File "/home/ubuntu/DeepSpeed-MII/.venv/lib/python3.8/site-packages/deepspeed/module_inject/auto_tp.py", line 35, in supported
if key.group(1).lower() in unsupported:
AttributeError: 'NoneType' object has no attribute 'group'
And 0.8.0 and 0.7.7 fail with
AttributeError: module 'diffusers.models.vae' has no attribute 'AutoencoderKL'
There is a bit more progress after reverting to diffusers 0.11.1 and deepspeed 0.8.0.
The server now loads, but crashes at inference in _fwd_kernel:
qk += tl.dot(q, k, trans_b=True)
^
Hi @stevensu1977 and @gaziqbal, can you try setting kernel injection to True?
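For anyone trying this, a minimal sketch of what enabling kernel injection looks like (the model ID and the .to("cuda") placement are assumptions based on the snippets in this thread):

import torch
import deepspeed
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
).to("cuda")
# Pass the whole pipeline; with kernel injection enabled, DeepSpeed swaps the
# UNet (and friends) for optimized implementations instead of trying to
# auto-tensor-parallelize the pipeline object itself.
deepspeed.init_inference(
    pipe,
    dtype=torch.float16,
    replace_with_kernel_inject=True,
)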
@molly-smith Loading the model works for me with the following packages:
accelerate==0.17.0
deepspeed==0.8.2
diffusers==0.14.0
transformers==4.26.1
triton==2.0.0
torch==1.13.1
I run the pipeline like this:
deepspeed.init_inference(pipe.to("cuda"), dtype=torch.float16, replace_with_kernel_inject=True, enable_cuda_graph=True)
But during inference it fails with the following error:
TypeError: DSUNet._forward() got an unexpected keyword argument 'cross_attention_kwargs'
Likewise, as @BogdanDarius reported: if I explicitly set config.replace_with_kernel_inject = True in InferenceEngine.__init__, then the model (CompVis/stable-diffusion-v1-4) loads but still crashes on inference.
diffusers 0.14.0 and 0.13.0 crash with the following error
grpc.aio._call.AioRpcError: <AioRpcError of RPC that terminated with:
status = StatusCode.UNKNOWN
details = "Exception calling application: _forward() got an unexpected keyword argument 'cross_attention_kwargs'"
debug_error_string = "UNKNOWN:Error received from peer ipv6:%5B::1%5D:50050 {created_time:"2023-03-14T00:04:46.791932314+00:00", grpc_status:2, grpc_message:"Exception calling application: _forward() got an unexpected keyword argument \'cross_attention_kwargs\'"}"
diffusers 0.11.1 crashes with the same error as above: https://github.com/microsoft/DeepSpeed/issues/2968#issuecomment-1466914276
The following version settings work:
accelerate
diffusers==0.6.0
torch
transformers[sentencepiece]==4.24.0
deepspeed==0.7.4
triton==2.0.0.dev20221030
Disregard PR https://github.com/microsoft/DeepSpeed/pull/3083.
Hey @molly-smith, I hit exactly the same error as @BogdanDarius and updated deepspeed/inference/engine.py based on PR https://github.com/microsoft/DeepSpeed/pull/3083, but I still get this error:
File "/opt/conda/lib/python3.10/site-packages/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py", line 667, in __call__
noise_pred = self.unet(
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/deepspeed/model_implementations/diffusers/unet.py", line 41, in forward
return self._forward(*inputs, **kwargs)
TypeError: DSUNet._forward() got an unexpected keyword argument 'cross_attention_kwargs'
Thank you!
Hey @molly-smith, if I use model = deepspeed.init_inference(pipe.to("cuda"), dtype=torch.float16) to disable kernel injection, I get the previous error 'StableDiffusionPipeline' object has no attribute 'children':
[2023-03-27 02:21:13,610] [INFO] [logging.py:93:log_dist] [Rank -1] DeepSpeed info: version=0.8.3, git-hash=unknown, git-branch=unknown
[2023-03-27 02:21:13,611] [INFO] [logging.py:93:log_dist] [Rank -1] quantize_bits = 8 mlp_extra_grouping = False, quantize_groups = 1
Traceback (most recent call last):
File "/test.py", line 17, in <module>
model = deepspeed.init_inference(pipe.to("cuda"), dtype=torch.float16)
File "/opt/conda/lib/python3.10/site-packages/deepspeed/__init__.py", line 311, in init_inference
engine = InferenceEngine(model, config=ds_inference_config)
File "/opt/conda/lib/python3.10/site-packages/deepspeed/inference/engine.py", line 139, in __init__
parser_dict = AutoTP.tp_parser(model)
File "/opt/conda/lib/python3.10/site-packages/deepspeed/module_inject/auto_tp.py", line 98, in tp_parser
module_list = AutoTP.get_module_list(model)
File "/opt/conda/lib/python3.10/site-packages/deepspeed/module_inject/auto_tp.py", line 19, in get_module_list
for child in model.children():
AttributeError: 'StableDiffusionPipeline' object has no attribute 'children'
I followed every step of this thread and landed on the same error as @JacquiML.
Hi all, sorry for the delay. The error 'StableDiffusionPipeline' object has no attribute 'children' is because you need to enable kernel injection.
The error TypeError: DSUNet._forward() got an unexpected keyword argument 'cross_attention_kwargs' is being caused by changes in the latest releases of diffusers (versions 0.13.0 and above). I am working on a fix. In the meantime, let me know if enabling kernel injection and using diffusers 0.12.0 or below works for you. Thanks.
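To make that workaround concrete, a pinned environment along these lines might work; the pins below are an assumption stitched together from the reports in this thread (deepspeed/transformers/torch from the list above, diffusers from the "0.12.0 or below" suggestion, triton from the advice further down), not an officially tested matrix:

deepspeed==0.8.2
diffusers==0.12.0
transformers==4.26.1
torch==1.13.1
triton==2.0.0.dev20221202

combined with replace_with_kernel_inject=True in deepspeed.init_inference() as sketched earlier.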
Hey @molly-smith, I have downgraded diffusers to 0.11.1 and then hit the error below. Thanks in advance for your support!
Time to load spatial_inference op: 21.2737238407135 seconds
**** found and replaced unet w. <class 'deepspeed.model_implementations.diffusers.unet.DSUNet'>
0%| | 0/50 [00:00<?, ?it/s]
------------------------------------------------------
Free memory : 18.335083 (GigaBytes)
Total memory: 22.199097 (GigaBytes)
Requested memory: 1.015625 (GigaBytes)
Setting maximum total tokens (input + output) to 4096
------------------------------------------------------
0%| | 0/50 [00:00<?, ?it/s]
Traceback (most recent call last):
File "<string>", line 21, in _fwd_kernel
KeyError: ('2-.-0-.-0-83ca8b715a9dc5f32dc1110973485f64-d6252949da17ceb5f3a278a70250af13-3b85c7bef5f0a641282f3b73af50f599-3d2aedeb40d6d81c66a42791e268f98b-3498c340fd4b6ee7805fd54b882a04f5-e1f133f98d04093da2078dfc51c36b72-b26258bf01f839199e39d64851821f26-d7c06e3b46e708006c15224aac7a1378-f585402118c8a136948ce0a49cfe122c', (torch.float16, torch.float16, torch.float16, 'fp32', torch.float32, torch.float16, 'i32', 'i32', 'i32', 'i32', 'i32', 'i32', 'i32', 'i32', 'i32', 'i32', 'i32', 'i32', 'i32', 'i32', 'i32', 'i32', 'i32', 'i32', 'i32'), (128, 64, 128), (True, True, True, (False,), True, True, (True, False), (True, False), (True, False), (False, True), (True, False), (True, False), (True, False), (False, True), (True, False), (True, False), (True, False), (False, True), (True, False), (True, False), (True, False), (False, True), (False, False), (False, False), (True, False)))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/conda/lib/python3.10/site-packages/triton/compiler.py", line 937, in build_triton_ir
generator.visit(fn.parse())
File "/opt/conda/lib/python3.10/site-packages/triton/compiler.py", line 855, in visit
return super().visit(node)
File "/opt/conda/lib/python3.10/ast.py", line 418, in visit
return visitor(node)
File "/opt/conda/lib/python3.10/site-packages/triton/compiler.py", line 183, in visit_Module
ast.NodeVisitor.generic_visit(self, node)
File "/opt/conda/lib/python3.10/ast.py", line 426, in generic_visit
self.visit(item)
File "/opt/conda/lib/python3.10/site-packages/triton/compiler.py", line 855, in visit
return super().visit(node)
File "/opt/conda/lib/python3.10/ast.py", line 418, in visit
return visitor(node)
File "/opt/conda/lib/python3.10/site-packages/triton/compiler.py", line 252, in visit_FunctionDef
has_ret = self.visit_compound_statement(node.body)
File "/opt/conda/lib/python3.10/site-packages/triton/compiler.py", line 177, in visit_compound_statement
self.last_ret_type = self.visit(stmt)
File "/opt/conda/lib/python3.10/site-packages/triton/compiler.py", line 855, in visit
return super().visit(node)
File "/opt/conda/lib/python3.10/ast.py", line 418, in visit
return visitor(node)
File "/opt/conda/lib/python3.10/site-packages/triton/compiler.py", line 678, in visit_For
self.visit_compound_statement(node.body)
File "/opt/conda/lib/python3.10/site-packages/triton/compiler.py", line 177, in visit_compound_statement
self.last_ret_type = self.visit(stmt)
File "/opt/conda/lib/python3.10/site-packages/triton/compiler.py", line 855, in visit
return super().visit(node)
File "/opt/conda/lib/python3.10/ast.py", line 418, in visit
return visitor(node)
File "/opt/conda/lib/python3.10/site-packages/triton/compiler.py", line 319, in visit_AugAssign
self.visit(assign)
File "/opt/conda/lib/python3.10/site-packages/triton/compiler.py", line 855, in visit
return super().visit(node)
File "/opt/conda/lib/python3.10/ast.py", line 418, in visit
return visitor(node)
File "/opt/conda/lib/python3.10/site-packages/triton/compiler.py", line 301, in visit_Assign
values = self.visit(node.value)
File "/opt/conda/lib/python3.10/site-packages/triton/compiler.py", line 855, in visit
return super().visit(node)
File "/opt/conda/lib/python3.10/ast.py", line 418, in visit
return visitor(node)
File "/opt/conda/lib/python3.10/site-packages/triton/compiler.py", line 339, in visit_BinOp
rhs = self.visit(node.right)
File "/opt/conda/lib/python3.10/site-packages/triton/compiler.py", line 855, in visit
return super().visit(node)
File "/opt/conda/lib/python3.10/ast.py", line 418, in visit
return visitor(node)
File "/opt/conda/lib/python3.10/site-packages/triton/compiler.py", line 797, in visit_Call
return fn(*args, _builder=self.builder, **kws)
File "/opt/conda/lib/python3.10/site-packages/triton/impl/base.py", line 22, in wrapper
return fn(*args, **kwargs)
TypeError: dot() got an unexpected keyword argument 'trans_b'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/root/performance_optimisation/deepspeed/generate_image_benchmark.py", line 20, in <module>
print(model(prompt))
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/deepspeed/inference/engine.py", line 562, in forward
outputs = self.module(*inputs, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py", line 529, in __call__
noise_pred = self.unet(latent_model_input, t, encoder_hidden_states=text_embeddings).sample
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/deepspeed/model_implementations/diffusers/unet.py", line 41, in forward
return self._forward(*inputs, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/deepspeed/model_implementations/diffusers/unet.py", line 63, in _forward
return self.unet(sample, timestamp, encoder_hidden_states, return_dict)
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/diffusers/models/unet_2d_condition.py", line 424, in forward
sample, res_samples = downsample_block(
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/diffusers/models/unet_2d_blocks.py", line 777, in forward
hidden_states = attn(hidden_states, encoder_hidden_states=encoder_hidden_states).sample
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/diffusers/models/attention.py", line 216, in forward
hidden_states = block(hidden_states, encoder_hidden_states=encoder_hidden_states, timestep=timestep)
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/deepspeed/ops/transformer/inference/diffusers_transformer_block.py", line 106, in forward
out_attn_1 = self.attn_1(out_norm_1)
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/deepspeed/ops/transformer/inference/diffusers_attention.py", line 228, in forward
output = DeepSpeedDiffusersAttentionFunction.apply(
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/function.py", line 506, in apply
return super().apply(*args, **kwargs) # type: ignore[misc]
File "/opt/conda/lib/python3.10/site-packages/deepspeed/ops/transformer/inference/diffusers_attention.py", line 117, in forward
output = selfAttention_fp(input, context, input_mask)
File "/opt/conda/lib/python3.10/site-packages/deepspeed/ops/transformer/inference/diffusers_attention.py", line 81, in selfAttention_fp
context_layer = triton_flash_attn_kernel(qkv_out[0],
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/deepspeed/ops/transformer/inference/triton_ops.py", line 120, in forward
_fwd_kernel[grid](
File "<string>", line 41, in _fwd_kernel
File "/opt/conda/lib/python3.10/site-packages/triton/compiler.py", line 1621, in compile
next_module = compile(module)
File "/opt/conda/lib/python3.10/site-packages/triton/compiler.py", line 1550, in <lambda>
lambda src: ast_to_ttir(src, signature, configs[0], constants)),
File "/opt/conda/lib/python3.10/site-packages/triton/compiler.py", line 962, in ast_to_ttir
mod, _ = build_triton_ir(fn, signature, specialization, constants)
File "/opt/conda/lib/python3.10/site-packages/triton/compiler.py", line 942, in build_triton_ir
raise CompilationError(fn.src, node) from e
triton.compiler.CompilationError: at 58:24:
def _fwd_kernel(
    Q,
    K,
    V,
    sm_scale,
    TMP,
    Out,
    stride_qz,
    stride_qh,
    stride_qm,
    stride_qk,
    stride_kz,
    stride_kh,
    stride_kn,
    stride_kk,
    stride_vz,
    stride_vh,
    stride_vk,
    stride_vn,
    stride_oz,
    stride_oh,
    stride_om,
    stride_on,
    Z,
    H,
    N_CTX,
    BLOCK_M: tl.constexpr,
    BLOCK_DMODEL: tl.constexpr,
    BLOCK_N: tl.constexpr,
):
    start_m = tl.program_id(0)
    off_hz = tl.program_id(1)
    # initialize offsets
    offs_m = start_m * BLOCK_M + tl.arange(0, BLOCK_M)
    offs_n = tl.arange(0, BLOCK_N)
    offs_d = tl.arange(0, BLOCK_DMODEL)
    off_q = off_hz * stride_qh + offs_m[:, None] * stride_qm + offs_d[None, :] * stride_qk
    off_k = off_hz * stride_kh + offs_n[:, None] * stride_kn + offs_d[None, :] * stride_kk
    off_v = off_hz * stride_vh + offs_n[:, None] * stride_qm + offs_d[None, :] * stride_qk
    # Initialize pointers to Q, K, V
    q_ptrs = Q + off_q
    k_ptrs = K + off_k
    v_ptrs = V + off_v
    # initialize pointer to m and l
    t_ptrs = TMP + off_hz * N_CTX + offs_m
    m_i = tl.zeros([BLOCK_M], dtype=tl.float32) - float("inf")
    l_i = tl.zeros([BLOCK_M], dtype=tl.float32)
    acc = tl.zeros([BLOCK_M, BLOCK_DMODEL], dtype=tl.float32)
    # load q: it will stay in SRAM throughout
    q = tl.load(q_ptrs)
    # loop over k, v and update accumulator
    for start_n in range(0, N_CTX, BLOCK_N):
        start_n = tl.multiple_of(start_n, BLOCK_N)
        # -- compute qk ----
        k = tl.load(k_ptrs + start_n * stride_kn)
        qk = tl.zeros([BLOCK_M, BLOCK_N], dtype=tl.float32)
        qk += tl.dot(q, k, trans_b=True)
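(For background: the trans_b keyword argument was removed from tl.dot in the final triton 2.0.0 release, which is why this kernel compiles on the dev builds pinned elsewhere in this thread but not on 2.0.0. On newer triton the equivalent expression would be an explicit transpose, e.g. qk += tl.dot(q, tl.trans(k)) — an illustration of the API change, not the DeepSpeed fix.)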
For diffusers 0.13.0 or above, please try https://github.com/microsoft/DeepSpeed/pull/3142
@JacquiML, I think you may need a different triton version. It should be triton 2.0.0.dev20221202.
Hey @molly-smith, thanks for the replies above!
For diffusers 0.13.0 or above. please try https://github.com/microsoft/DeepSpeed/pull/3142
Great! It works on diffusers 0.14.0
@JacquiML , I think you may need a different triton version. It should be triton 2.0.0.dev20221202
Yep, it works on diffusers 0.14.0 with triton 2.0.0.dev20221202. But triton 2.0.0.dev20221202 needs PyTorch 1.13.1, so installing it downgrades PyTorch 2.0 to 1.13.1. Could DeepSpeed support PyTorch 2.0 too? Thanks!
Hey @molly-smith, a follow-up question: do you know an ETA for when the merged PR https://github.com/microsoft/DeepSpeed/pull/3142 will be included in a new DeepSpeed release on PyPI? Many thanks!
Hi @molly-smith, I was using diffusers 0.14, deepspeed 0.9.0, and pytorch 1.13, but still got this error.
Hi all, sorry for the delay. The error 'StableDiffusionPipeline' object has no attribute 'children' is because you need to enable kernel injection.
The error TypeError: DSUNet._forward() got an unexpected keyword argument 'cross_attention_kwargs' is being caused by changes in the latest releases of diffusers (versions 0.13.0 and above). I am working on a fix. In the meantime, let me know if enabling kernel injection and using diffusers 0.12.0 or below works for you. Thanks.
@molly-smith Thanks for your support. However, after I enabled kernel injection, the error became "module 'diffusers.models.attention' has no attribute 'CrossAttention'". The diffusers version was 0.15.0. Is there any solution to this?
Besides, when I downgraded diffusers to 0.11.1, the model loaded successfully, but during inference it shows:
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper__index_select)
@jy00161yang, this fix is for diffusers 0.13.0 and 0.14.0. Diffusers 0.15.0 was released after this fix. I will work on a fix for 0.15.0 soon.
Hey @molly-smith, a follow-up question: do you know an ETA for when the merged PR #3142 will be included in a new DeepSpeed release on PyPI? Many thanks!
@JacquiML It should be included in DeepSpeed v0.9.0.
@jy00161yang
@molly-smith Thanks for your support. However, after I enabled kernel injection, the error became "module 'diffusers.models.attention' has no attribute 'CrossAttention'". The diffusers version was 0.15.0. Is there any solution to this?
I got the same problem. Did you manage to resolve it?
You can add the parameter replace_with_kernel_inject=True in deepspeed.init_inference(); that fixed the same bug for me.
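For completeness, a sketch of that fix applied to the FastAPI startup snippet from earlier in this thread (app, settings, and the rest as defined there; still subject to the diffusers/triton version caveats above):

deepspeed.init_inference(
    app.state.pipe,                   # pass the pipeline itself, not getattr(..., "model", ...)
    mp_size=1,                        # number of GPUs
    dtype=torch.float16,
    replace_with_kernel_inject=True,  # the fix: enable kernel injection
)

Since kernel injection swaps the pipeline's sub-modules in place, the existing app.state.pipe(...) call in generate() should then pick up the injected modules.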