ColossalAI
[BUG]: diffusion inference error
🐛 Describe the bug
python scripts/txt2img.py --prompt "Teyvat, Name:Layla, Element: Cryo, Weapon:Sword, Region:Sumeru, Model type:Medium Female, Description:a woman in a blue outfit holding a sword" --plms --outdir output --config 2022-12-02T02-14-03-project.yaml --ckpt last.ckpt
I got the following error:
The code commit ID is 6e51d296f07c0ad34d7f85cf9a70d4ceee15ede7.
I updated to edf4cd46c5395899c795f43bdc3d4a8b16166531 and tried again:
I then tried to train again to get a new checkpoint, but a new error occurred:
`oder.layers.13.self_attn.v_proj.bias', 'vision_model.encoder.layers.19.layer_norm2.weight', 'vision_model.encoder.layers.22.mlp.fc2.weight', 'vision_model.encoder.layers.10.self_attn.q_proj.bias', 'vision_model.encoder.layers.7.layer_norm1.weight', 'vision_model.encoder.layers.22.layer_norm2.weight', 'vision_model.encoder.layers.7.mlp.fc1.bias', 'vision_model.encoder.layers.10.layer_norm2.bias', 'vision_model.encoder.layers.0.self_attn.k_proj.weight', 'vision_model.encoder.layers.12.self_attn.out_proj.weight', 'vision_model.encoder.layers.0.layer_norm2.weight', 'vision_model.encoder.layers.3.self_attn.out_proj.weight', 'vision_model.encoder.layers.15.mlp.fc1.bias', 'vision_model.encoder.layers.16.mlp.fc2.bias', 'vision_model.encoder.layers.17.self_attn.k_proj.bias', 'vision_model.encoder.layers.10.self_attn.out_proj.weight']
- This IS expected if you are initializing CLIPTextModelZero from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing CLIPTextModelZero from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Using strategy: pytorch_lightning.strategies.ColossalAIStrategy
Monitoring val/loss_simple_ema as checkpoint metric.
Merged modelckpt-cfg:
{'target': 'lightning.pytorch.callbacks.ModelCheckpoint', 'params': {'dirpath': 'output/2022-12-02T17-05-18_train_colossalaitest/checkpoints', 'filename': '{epoch:06}', 'verbose': True, 'save_last': True, 'monitor': 'val/loss_simple_ema', 'save_top_k': 3}}
Traceback (most recent call last):
  File "/home/notebook//code/ColossalAI/examples/images/diffusion/main.py", line 746, in <module>
    trainer = Trainer.from_argparse_args(trainer_opt, **trainer_kwargs)
  File "/opt/conda/envs/ldm1/lib/python3.9/site-packages/lightning/pytorch/trainer/trainer.py", line 1917, in from_argparse_args
    return from_argparse_args(cls, args, **kwargs)
  File "/opt/conda/envs/ldm1/lib/python3.9/site-packages/lightning/pytorch/utilities/argparse.py", line 66, in from_argparse_args
    return cls(**trainer_kwargs)
  File "/opt/conda/envs/ldm1/lib/python3.9/site-packages/lightning/pytorch/utilities/argparse.py", line 340, in insert_env_defaults
    return fn(self, **kwargs)
  File "/opt/conda/envs/ldm1/lib/python3.9/site-packages/lightning/pytorch/trainer/trainer.py", line 408, in __init__
    self._accelerator_connector = AcceleratorConnector(
  File "/opt/conda/envs/ldm1/lib/python3.9/site-packages/lightning/pytorch/trainer/connectors/accelerator_connector.py", line 223, in __init__
    self._init_strategy()
  File "/opt/conda/envs/ldm1/lib/python3.9/site-packages/lightning/pytorch/trainer/connectors/accelerator_connector.py", line 671, in _init_strategy
    raise RuntimeError(f"{self.strategy} is not valid type: {self.strategy}")
AttributeError: 'AcceleratorConnector' object has no attribute 'strategy'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/notebook//code/ColossalAI/examples/images/diffusion/main.py", line 829, in <module>
These errors are driving me crazy! Could you tell me which version is stable, or give me a commit ID that you have verified works?
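One detail visible in the training log: the strategy is reported as `pytorch_lightning.strategies.ColossalAIStrategy`, while the checkpoint callback and the traceback paths come from `lightning.pytorch`. Whether this mismatch is what triggers the `AttributeError` is an assumption, not something confirmed here. The sketch below only groups the dotted targets copied from the log by top-level package to make the mix visible; the helper names `top_level_package` and `namespaces_used` are hypothetical and not part of ColossalAI or Lightning.

```python
def top_level_package(target):
    """Return the first component of a dotted import path."""
    return target.split(".", 1)[0]


def namespaces_used(targets):
    """Group dotted targets by their top-level package."""
    groups = {}
    for target in targets:
        groups.setdefault(top_level_package(target), []).append(target)
    return groups


# Targets copied verbatim from the log output in this report.
targets = [
    "pytorch_lightning.strategies.ColossalAIStrategy",
    "lightning.pytorch.callbacks.ModelCheckpoint",
]

groups = namespaces_used(targets)
if len(groups) > 1:
    # More than one top-level package means the run mixes namespaces.
    print("Mixed Lightning namespaces:", sorted(groups))
```

If the two names resolve to different installed packages in the environment, the strategy object may not be recognized by the `Trainer` that is actually constructed; checking which packages are installed would be the next step.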
Environment
No response