Dreambooth-Stable-Diffusion

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument index in method wrapper__index_select)

TemporalLabsLLC-SOL opened this issue 2 years ago • 6 comments

```
C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\loggers\test_tube.py:104: LightningDeprecationWarning: The TestTubeLogger is deprecated since v1.5 and will be removed in v1.7. We recommend switching to the `pytorch_lightning.loggers.TensorBoardLogger` as an alternative.
  rank_zero_deprecation(
Monitoring val/loss_simple_ema as checkpoint metric.
Merged modelckpt-cfg: {'target': 'pytorch_lightning.callbacks.ModelCheckpoint', 'params': {'dirpath': 'logs\\SUBJECT2022-10-04T06-25-48_DSU90\\checkpoints', 'filename': '{epoch:06}', 'verbose': True, 'save_last': True, 'monitor': 'val/loss_simple_ema', 'save_top_k': 1, 'every_n_train_steps': 500}}
GPU available: True, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\trainer\trainer.py:1584: UserWarning: GPU available but not used. Set the gpus flag in your trainer `Trainer(gpus=1)` or script `--gpus=1`.
  rank_zero_warn(
```

Data

```
train, PersonalizedBase, 1500
reg, PersonalizedBase, 15000
validation, PersonalizedBase, 15
accumulate_grad_batches = 1
++++ NOT USING LR SCALING ++++
Setting learning rate to 1.00e-06
C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\trainer\configuration_validator.py:275: LightningDeprecationWarning: The on_keyboard_interrupt callback hook was deprecated in v1.5 and will be removed in v1.7. Please use the on_exception callback hook instead.
  rank_zero_deprecation(
C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\trainer\configuration_validator.py:284: LightningDeprecationWarning: Base LightningModule.on_train_batch_start hook signature has changed in v1.5. The dataloader_idx argument will be removed in v1.7.
  rank_zero_deprecation(
C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\trainer\configuration_validator.py:291: LightningDeprecationWarning: Base Callback.on_train_batch_end hook signature has changed in v1.5. The dataloader_idx argument will be removed in v1.7.
  rank_zero_deprecation(
C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\core\datamodule.py:469: LightningDeprecationWarning: DataModule.setup has already been called, so it will not be called again. In v1.6 this behavior will change to always call DataModule.setup.
  rank_zero_deprecation(
LatentDiffusion: Also optimizing conditioner params!
Project config
model:
  base_learning_rate: 1.0e-06
  target: ldm.models.diffusion.ddpm.LatentDiffusion
  params:
    reg_weight: 1.0
    linear_start: 0.00085
    linear_end: 0.012
    num_timesteps_cond: 1
    log_every_t: 200
    timesteps: 1000
    first_stage_key: image
    cond_stage_key: caption
    image_size: 64
    channels: 4
    cond_stage_trainable: true
    conditioning_key: crossattn
    monitor: val/loss_simple_ema
    scale_factor: 0.18215
    use_ema: false
    embedding_reg_weight: 0.0
    unfreeze_model: true
    model_lr: 1.0e-06
    personalization_config:
      target: ldm.modules.embedding_manager.EmbeddingManager
      params:
        placeholder_strings:
        - '*'
        initializer_words:
        - sculpture
        per_image_tokens: false
        num_vectors_per_token: 1
        progressive_words: false
    unet_config:
      target: ldm.modules.diffusionmodules.openaimodel.UNetModel
      params:
        image_size: 32
        in_channels: 4
        out_channels: 4
        model_channels: 320
        attention_resolutions:
        - 4
        - 2
        - 1
        num_res_blocks: 2
        channel_mult:
        - 1
        - 2
        - 4
        - 4
        num_heads: 8
        use_spatial_transformer: true
        transformer_depth: 1
        context_dim: 768
        use_checkpoint: true
        legacy: false
    first_stage_config:
      target: ldm.models.autoencoder.AutoencoderKL
      params:
        embed_dim: 4
        monitor: val/rec_loss
        ddconfig:
          double_z: true
          z_channels: 4
          resolution: 512
          in_channels: 3
          out_ch: 3
          ch: 128
          ch_mult:
          - 1
          - 2
          - 4
          - 4
          num_res_blocks: 2
          attn_resolutions: []
          dropout: 0.0
        lossconfig:
          target: torch.nn.Identity
    cond_stage_config:
      target: ldm.modules.encoders.modules.FrozenCLIPEmbedder
    ckpt_path: C:\Users\Urban\Desktop\textual_inversion-main\models\ldm\sd-v1-4-full-ema.ckpt
data:
  target: main.DataModuleFromConfig
  params:
    batch_size: 1
    num_workers: 1
    wrap: false
    train:
      target: ldm.data.personalized.PersonalizedBase
      params:
        size: 512
        set: train
        per_image_tokens: false
        repeats: 100
        placeholder_token: dog
    reg:
      target: ldm.data.personalized.PersonalizedBase
      params:
        size: 512
        set: train
        reg: true
        per_image_tokens: false
        repeats: 10
        placeholder_token: dog
    validation:
      target: ldm.data.personalized.PersonalizedBase
      params:
        size: 512
        set: val
        per_image_tokens: false
        repeats: 10
        placeholder_token: dog
```

Lightning config

```
modelcheckpoint:
  params:
    every_n_train_steps: 500
callbacks:
  image_logger:
    target: main.ImageLogger
    params:
      batch_frequency: 200
      max_images: 8
      increase_log_steps: false
trainer:
  benchmark: true
  max_steps: 800
  gpus: 0
```
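Note the last line of the merged Lightning config: `gpus: 0` means the Trainer was built with zero GPUs even though CUDA is available, which matches the "GPU available but not used" warning above. A sketch of what the merged trainer section should look like once a GPU is actually requested (the surrounding keys are taken from the log above; the comment is an assumption about how you pass the flag):

```
trainer:
  benchmark: true
  max_steps: 800
  gpus: 1   # e.g. by passing --gpus=1 (or --gpus 0,) to main.py
```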

```
  | Name              | Type               | Params
---------------------------------------------------
0 | model             | DiffusionWrapper   | 859 M
1 | first_stage_model | AutoencoderKL      | 83.7 M
2 | cond_stage_model  | FrozenCLIPEmbedder | 123 M
---------------------------------------------------
982 M     Trainable params
83.7 M    Non-trainable params
1.1 B     Total params
4,264.941 Total estimated model params size (MB)
Validation sanity check: 0it [00:00, ?it/s]C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\trainer\data_loading.py:132: UserWarning: The dataloader, val_dataloader 0, does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument (try 8 which is the number of cpus on this machine) in the `DataLoader` init to improve performance.
  rank_zero_warn(
Validation sanity check:   0%|          | 0/2 [00:00<?, ?it/s]Summoning checkpoint.
```

```
Traceback (most recent call last):
  File "main.py", line 838, in <module>
    trainer.fit(model, data)
  File "C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 740, in fit
    self._call_and_handle_interrupt(
  File "C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 685, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 777, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1199, in _run
    self._dispatch()
  File "C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1279, in _dispatch
    self.training_type_plugin.start_training(self)
  File "C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\plugins\training_type\training_type_plugin.py", line 202, in start_training
    self._results = trainer.run_stage()
  File "C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1289, in run_stage
    return self._run_train()
  File "C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1311, in _run_train
    self._run_sanity_check(self.lightning_module)
  File "C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1375, in _run_sanity_check
    self._evaluation_loop.run()
  File "C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\loops\base.py", line 145, in run
    self.advance(*args, **kwargs)
  File "C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\loops\dataloader\evaluation_loop.py", line 110, in advance
    dl_outputs = self.epoch_loop.run(dataloader, dataloader_idx, dl_max_batches, self.num_dataloaders)
  File "C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\loops\base.py", line 145, in run
    self.advance(*args, **kwargs)
  File "C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\loops\epoch\evaluation_epoch_loop.py", line 122, in advance
    output = self._evaluation_step(batch, batch_idx, dataloader_idx)
  File "C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\loops\epoch\evaluation_epoch_loop.py", line 217, in _evaluation_step
    output = self.trainer.accelerator.validation_step(step_kwargs)
  File "C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\accelerators\accelerator.py", line 236, in validation_step
    return self.training_type_plugin.validation_step(*step_kwargs.values())
  File "C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\plugins\training_type\training_type_plugin.py", line 219, in validation_step
    return self.model.validation_step(*args, **kwargs)
  File "C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "C:\Users\Urban\Desktop\Dreambooth-SD-optimized-main\ldm\models\diffusion\ddpm.py", line 368, in validation_step
    _, loss_dict_no_ema = self.shared_step(batch)
  File "C:\Users\Urban\Desktop\Dreambooth-SD-optimized-main\ldm\models\diffusion\ddpm.py", line 908, in shared_step
    loss = self(x, c)
  File "C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\Urban\Desktop\Dreambooth-SD-optimized-main\ldm\models\diffusion\ddpm.py", line 937, in forward
    c = self.get_learned_conditioning(c)
  File "C:\Users\Urban\Desktop\Dreambooth-SD-optimized-main\ldm\models\diffusion\ddpm.py", line 595, in get_learned_conditioning
    c = self.cond_stage_model.encode(c, embedding_manager=self.embedding_manager)
  File "C:\Users\Urban\Desktop\Dreambooth-SD-optimized-main\ldm\modules\encoders\modules.py", line 324, in encode
    return self(text, **kwargs)
  File "C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\Urban\Desktop\Dreambooth-SD-optimized-main\ldm\modules\encoders\modules.py", line 319, in forward
    z = self.transformer(input_ids=tokens, **kwargs)
  File "C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\Urban\Desktop\Dreambooth-SD-optimized-main\ldm\modules\encoders\modules.py", line 297, in transformer_forward
    return self.text_model(
  File "C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\Urban\Desktop\Dreambooth-SD-optimized-main\ldm\modules\encoders\modules.py", line 258, in text_encoder_forward
    hidden_states = self.embeddings(input_ids=input_ids, position_ids=position_ids, embedding_manager=embedding_manager)
  File "C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\Urban\Desktop\Dreambooth-SD-optimized-main\ldm\modules\encoders\modules.py", line 180, in embedding_forward
    inputs_embeds = self.token_embedding(input_ids)
  File "C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\torch\nn\modules\sparse.py", line 158, in forward
    return F.embedding(
  File "C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\torch\nn\functional.py", line 2199, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument index in method wrapper__index_select)
```
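The failing call is the CLIP token-embedding lookup: the embedding weights sit on one device and the token ids on the other. With `gpus: 0` the Trainer leaves the LightningModule (including `token_embedding`) on CPU, while, as far as I can tell from the stock ldm code, `FrozenCLIPEmbedder` moves its tokenized input to `cuda` by default. A minimal, self-contained reproduction of the same failure mode (illustrative only; the vocabulary size and ids are arbitrary):

```python
import torch
import torch.nn as nn

# The embedding stays on CPU, like a model run under Trainer(gpus=0) ...
emb = nn.Embedding(49408, 768)

ids = torch.tensor([[101, 102, 103]])
if torch.cuda.is_available():
    # ... while the token ids get pushed to cuda:0 elsewhere in the code.
    ids = ids.to("cuda:0")

# RuntimeError: Expected all tensors to be on the same device,
# but found at least two devices, cpu and cuda:0!
out = emb(ids)
```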

I'd appreciate any perspective on getting this to make the right device calls in a Windows environment where WSL is not an option.

TemporalLabsLLC-SOL avatar Oct 04 '22 12:10 TemporalLabsLLC-SOL

I have the same problem. Did you solve it?

Add `--gpus=1`; it works.
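For context, the flag goes on the `main.py` training command. A sketch based on the repo's README-style invocation (the paths, run name, and class word here are placeholders, not values from this thread; `^` is cmd.exe line continuation since the reporter is on Windows):

```
python main.py --base configs/stable-diffusion/v1-finetune_unfrozen.yaml -t ^
    --actual_resume sd-v1-4-full-ema.ckpt -n myrun ^
    --data_root .\training_images --reg_data_root .\reg_images ^
    --class_word dog --gpus=1
```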

xzdong-2019 avatar Dec 26 '22 03:12 xzdong-2019

same problem

howardgriffin avatar Jan 12 '23 06:01 howardgriffin

same problem

XinyangHan avatar Feb 05 '23 23:02 XinyangHan

There are a couple of known fixes depending on your specific environment. I can compile some links later, but try the search function too.


TemporalLabsLLC-SOL avatar Feb 06 '23 00:02 TemporalLabsLLC-SOL

@xzdong-2019, may I ask how you solved it? I mean, where should we add `--gpus=1`?

XinyangHan avatar Feb 06 '23 04:02 XinyangHan

> @xzdong-2019, may I ask how you solved it? I mean, where should we add `--gpus=1`?

```
python main.py --gpus 0, --prompt ....
```

It works for me.
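One detail worth flagging: the trailing comma in `--gpus 0,` matters. In PyTorch Lightning 1.x, `--gpus 0` means "zero GPUs" (which reproduces this exact error), while `--gpus 0,` is parsed as the list `[0]`, i.e. "use GPU index 0". Side by side (the trailing `...` stands for the rest of your training flags):

```
python main.py --gpus 0  ...   # 0 GPUs -> model stays on CPU -> device-mismatch error
python main.py --gpus 0, ...   # GPU index 0 -> everything runs on cuda:0
```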

Eun0 avatar Jun 28 '23 11:06 Eun0