pix2pixHD
How to generate different "styles"?
In the readme there are examples of a single input image where the network is able to generate multiple versions (styles) during inference. I have successfully trained my model but I don't know how to achieve that with the given scripts (esp. test.py).
Thanks a lot in advance!
This can be done by using the feat.sh scripts, where you concatenate some feature vector as input.
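As a rough illustration of what "concatenating a feature vector as input" means, here is a minimal pure-Python sketch that stacks a 3-channel label image and a 3-channel feature map along the channel axis. The helper concat_channels and the toy data are hypothetical and are not taken from the repository; the 3 + 3 = 6 channel count is only meant to match input_nc: 3, feat_num: 3 and the Conv2d(6, 64, ...) first layer shown in the log further down this thread.

```python
def concat_channels(label_map, feat_map):
    """Stack two (C, H, W) nested lists along the channel axis."""
    h, w = len(label_map[0]), len(label_map[0][0])
    for ch in feat_map:
        # Both inputs must share the same height and width.
        if len(ch) != h or any(len(row) != w for row in ch):
            raise ValueError("spatial sizes must match")
    return label_map + feat_map


# Toy 3-channel, 4x4 inputs (hypothetical values).
label_map = [[[0.0] * 4 for _ in range(4)] for _ in range(3)]
feat_map = [[[0.5] * 4 for _ in range(4)] for _ in range(3)]

generator_input = concat_channels(label_map, feat_map)
print(len(generator_input))  # number of channels fed to the generator
```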
Oh I see. Thanks for answering so quickly!
I have some questions regarding the scripts and it would be so awesome if you could provide some guidance :)
- I don't need to precompute feature maps when initially training in 512p, right? (like in train_512p_feat.sh)
- I can't reuse already trained 512p models without the --instance_feat flag, right?
- Do I need to provide instance maps in this case, or can I use --instance_feat in conjunction with --no_instance?
- This might sound silly, but how does the model actually decide which "style" to use in test_512p_feat.sh? Or does this generate multiple outcomes, and if yes, how many?
- No, you don't.
- I think you can finetune using that, which should be faster than retraining from scratch.
- If you don't have instance maps, please use --label_feat instead.
- It will precompute some clusters on the training images, then randomly select a cluster at inference.
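The last answer (cluster the training features, then pick a cluster at random) can be sketched roughly as follows. This is a toy, pure-Python illustration and not the repository's code: the functions kmeans and pick_style and the random toy features are hypothetical, and the values 3 and 10 are only borrowed from the feat_num and n_clusters options shown in the log below.

```python
import random


def kmeans(points, k, iters=20, seed=0):
    """Tiny k-means over equal-length float tuples; returns k cluster centers."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean distance).
        groups = [[] for _ in range(k)]
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            groups[dists.index(min(dists))].append(p)
        # Move each center to the mean of its group; keep it if the group is empty.
        centers = [
            tuple(sum(dim) / len(g) for dim in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
    return centers


def pick_style(centers, rng=random):
    """Pick one precomputed cluster center at random to use as a 'style' vector."""
    return rng.choice(centers)


# Hypothetical encoded features: 3 numbers per training image (cf. feat_num: 3).
data_rng = random.Random(42)
train_feats = [tuple(data_rng.uniform(-1, 1) for _ in range(3)) for _ in range(200)]

centers = kmeans(train_feats, k=10)             # cf. n_clusters: 10
style = pick_style(centers, random.Random(1))   # another seed may pick another style
```

Under this reading, each inference run yields one output per input image, and repeating the run with a different random choice would give a different "style".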
Thanks, once again :) So I need to input additional "feature vectors" for training with --label_feat? How would I generate them and in which format?
Currently, when I'm just trying to train like usual (with train_A and train_B folders) and add the --label_feat option I'm getting an exception:
------------ Options -------------
batchSize: 1
beta1: 0.5
checkpoints_dir: ./pix2pixHD/checkpoints/
continue_train: False
data_type: 32
dataroot: ./pix2pixHD/dataset/
debug: False
display_freq: 50
display_winsize: 512
feat_num: 3
fineSize: 512
gpu_ids: [0]
input_nc: 3
instance_feat: False
isTrain: True
label_feat: True
label_nc: 0
lambda_feat: 10.0
loadSize: 512
load_features: False
load_pretrain:
lr: 0.0002
max_dataset_size: inf
model: pix2pixHD
nThreads: 2
n_blocks_global: 9
n_blocks_local: 3
n_clusters: 10
n_downsample_E: 4
n_downsample_global: 4
n_layers_D: 3
n_local_enhancers: 1
name: label2city_512p_feat
ndf: 64
nef: 16
netG: local
ngf: 32
niter: 100
niter_decay: 100
niter_fix_global: 0
no_flip: False
no_ganFeat_loss: False
no_html: False
no_instance: True
no_lsgan: False
no_vgg_loss: False
norm: instance
num_D: 2
output_nc: 3
phase: train
pool_size: 0
print_freq: 50
resize_or_crop: none
save_epoch_freq: 10
save_latest_freq: 500
serial_batches: False
tf_log: False
use_dropout: False
verbose: True
which_epoch: latest
-------------- End ----------------
CustomDatasetDataLoader
dataset [AlignedDataset] was created
#training images = 12353
LocalEnhancer(
(model): Sequential(
(0): ReflectionPad2d((3, 3, 3, 3))
(1): Conv2d(6, 64, kernel_size=(7, 7), stride=(1, 1))
(2): InstanceNorm2d(64, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(3): ReLU(inplace)
(4): Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
(5): InstanceNorm2d(128, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(6): ReLU(inplace)
(7): Conv2d(128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
(8): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(9): ReLU(inplace)
(10): Conv2d(256, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
(11): InstanceNorm2d(512, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(12): ReLU(inplace)
(13): Conv2d(512, 1024, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
(14): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(15): ReLU(inplace)
(16): ResnetBlock(
(conv_block): Sequential(
(0): ReflectionPad2d((1, 1, 1, 1))
(1): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(2): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(3): ReLU(inplace)
(4): ReflectionPad2d((1, 1, 1, 1))
(5): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(6): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
)
)
(17): ResnetBlock(
(conv_block): Sequential(
(0): ReflectionPad2d((1, 1, 1, 1))
(1): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(2): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(3): ReLU(inplace)
(4): ReflectionPad2d((1, 1, 1, 1))
(5): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(6): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
)
)
(18): ResnetBlock(
(conv_block): Sequential(
(0): ReflectionPad2d((1, 1, 1, 1))
(1): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(2): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(3): ReLU(inplace)
(4): ReflectionPad2d((1, 1, 1, 1))
(5): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(6): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
)
)
(19): ResnetBlock(
(conv_block): Sequential(
(0): ReflectionPad2d((1, 1, 1, 1))
(1): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(2): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(3): ReLU(inplace)
(4): ReflectionPad2d((1, 1, 1, 1))
(5): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(6): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
)
)
(20): ResnetBlock(
(conv_block): Sequential(
(0): ReflectionPad2d((1, 1, 1, 1))
(1): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(2): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(3): ReLU(inplace)
(4): ReflectionPad2d((1, 1, 1, 1))
(5): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(6): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
)
)
(21): ResnetBlock(
(conv_block): Sequential(
(0): ReflectionPad2d((1, 1, 1, 1))
(1): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(2): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(3): ReLU(inplace)
(4): ReflectionPad2d((1, 1, 1, 1))
(5): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(6): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
)
)
(22): ResnetBlock(
(conv_block): Sequential(
(0): ReflectionPad2d((1, 1, 1, 1))
(1): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(2): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(3): ReLU(inplace)
(4): ReflectionPad2d((1, 1, 1, 1))
(5): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(6): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
)
)
(23): ResnetBlock(
(conv_block): Sequential(
(0): ReflectionPad2d((1, 1, 1, 1))
(1): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(2): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(3): ReLU(inplace)
(4): ReflectionPad2d((1, 1, 1, 1))
(5): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(6): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
)
)
(24): ResnetBlock(
(conv_block): Sequential(
(0): ReflectionPad2d((1, 1, 1, 1))
(1): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(2): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(3): ReLU(inplace)
(4): ReflectionPad2d((1, 1, 1, 1))
(5): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(6): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
)
)
(25): ConvTranspose2d(1024, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), output_padding=(1, 1))
(26): InstanceNorm2d(512, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(27): ReLU(inplace)
(28): ConvTranspose2d(512, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), output_padding=(1, 1))
(29): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(30): ReLU(inplace)
(31): ConvTranspose2d(256, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), output_padding=(1, 1))
(32): InstanceNorm2d(128, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(33): ReLU(inplace)
(34): ConvTranspose2d(128, 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), output_padding=(1, 1))
(35): InstanceNorm2d(64, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(36): ReLU(inplace)
)
(model1_1): Sequential(
(0): ReflectionPad2d((3, 3, 3, 3))
(1): Conv2d(6, 32, kernel_size=(7, 7), stride=(1, 1))
(2): InstanceNorm2d(32, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(3): ReLU(inplace)
(4): Conv2d(32, 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
(5): InstanceNorm2d(64, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(6): ReLU(inplace)
)
(model1_2): Sequential(
(0): ResnetBlock(
(conv_block): Sequential(
(0): ReflectionPad2d((1, 1, 1, 1))
(1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1))
(2): InstanceNorm2d(64, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(3): ReLU(inplace)
(4): ReflectionPad2d((1, 1, 1, 1))
(5): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1))
(6): InstanceNorm2d(64, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
)
)
(1): ResnetBlock(
(conv_block): Sequential(
(0): ReflectionPad2d((1, 1, 1, 1))
(1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1))
(2): InstanceNorm2d(64, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(3): ReLU(inplace)
(4): ReflectionPad2d((1, 1, 1, 1))
(5): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1))
(6): InstanceNorm2d(64, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
)
)
(2): ResnetBlock(
(conv_block): Sequential(
(0): ReflectionPad2d((1, 1, 1, 1))
(1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1))
(2): InstanceNorm2d(64, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(3): ReLU(inplace)
(4): ReflectionPad2d((1, 1, 1, 1))
(5): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1))
(6): InstanceNorm2d(64, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
)
)
(3): ConvTranspose2d(64, 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), output_padding=(1, 1))
(4): InstanceNorm2d(32, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(5): ReLU(inplace)
(6): ReflectionPad2d((3, 3, 3, 3))
(7): Conv2d(32, 3, kernel_size=(7, 7), stride=(1, 1))
(8): Tanh()
)
(downsample): AvgPool2d(kernel_size=3, stride=2, padding=[1, 1])
)
MultiscaleDiscriminator(
(scale0_layer0): Sequential(
(0): Conv2d(6, 64, kernel_size=(4, 4), stride=(2, 2), padding=(2, 2))
(1): LeakyReLU(negative_slope=0.2, inplace)
)
(scale0_layer1): Sequential(
(0): Conv2d(64, 128, kernel_size=(4, 4), stride=(2, 2), padding=(2, 2))
(1): InstanceNorm2d(128, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(2): LeakyReLU(negative_slope=0.2, inplace)
)
(scale0_layer2): Sequential(
(0): Conv2d(128, 256, kernel_size=(4, 4), stride=(2, 2), padding=(2, 2))
(1): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(2): LeakyReLU(negative_slope=0.2, inplace)
)
(scale0_layer3): Sequential(
(0): Conv2d(256, 512, kernel_size=(4, 4), stride=(1, 1), padding=(2, 2))
(1): InstanceNorm2d(512, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(2): LeakyReLU(negative_slope=0.2, inplace)
)
(scale0_layer4): Sequential(
(0): Conv2d(512, 1, kernel_size=(4, 4), stride=(1, 1), padding=(2, 2))
)
(scale1_layer0): Sequential(
(0): Conv2d(6, 64, kernel_size=(4, 4), stride=(2, 2), padding=(2, 2))
(1): LeakyReLU(negative_slope=0.2, inplace)
)
(scale1_layer1): Sequential(
(0): Conv2d(64, 128, kernel_size=(4, 4), stride=(2, 2), padding=(2, 2))
(1): InstanceNorm2d(128, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(2): LeakyReLU(negative_slope=0.2, inplace)
)
(scale1_layer2): Sequential(
(0): Conv2d(128, 256, kernel_size=(4, 4), stride=(2, 2), padding=(2, 2))
(1): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(2): LeakyReLU(negative_slope=0.2, inplace)
)
(scale1_layer3): Sequential(
(0): Conv2d(256, 512, kernel_size=(4, 4), stride=(1, 1), padding=(2, 2))
(1): InstanceNorm2d(512, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(2): LeakyReLU(negative_slope=0.2, inplace)
)
(scale1_layer4): Sequential(
(0): Conv2d(512, 1, kernel_size=(4, 4), stride=(1, 1), padding=(2, 2))
)
(downsample): AvgPool2d(kernel_size=3, stride=2, padding=[1, 1])
)
Encoder(
(model): Sequential(
(0): ReflectionPad2d((3, 3, 3, 3))
(1): Conv2d(3, 16, kernel_size=(7, 7), stride=(1, 1))
(2): InstanceNorm2d(16, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(3): ReLU(inplace)
(4): Conv2d(16, 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
(5): InstanceNorm2d(32, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(6): ReLU(inplace)
(7): Conv2d(32, 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
(8): InstanceNorm2d(64, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(9): ReLU(inplace)
(10): Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
(11): InstanceNorm2d(128, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(12): ReLU(inplace)
(13): Conv2d(128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
(14): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(15): ReLU(inplace)
(16): ConvTranspose2d(256, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), output_padding=(1, 1))
(17): InstanceNorm2d(128, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(18): ReLU(inplace)
(19): ConvTranspose2d(128, 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), output_padding=(1, 1))
(20): InstanceNorm2d(64, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(21): ReLU(inplace)
(22): ConvTranspose2d(64, 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), output_padding=(1, 1))
(23): InstanceNorm2d(32, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(24): ReLU(inplace)
(25): ConvTranspose2d(32, 16, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), output_padding=(1, 1))
(26): InstanceNorm2d(16, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(27): ReLU(inplace)
(28): ReflectionPad2d((3, 3, 3, 3))
(29): Conv2d(16, 3, kernel_size=(7, 7), stride=(1, 1))
(30): Tanh()
)
)
---------- Networks initialized -------------
model [Pix2PixHDModel] was created
create web directory ./pix2pixHD/checkpoints/label2city_512p_feat/web...
Exception ignored in: <bound method _DataLoaderIter.__del__ of <torch.utils.data.dataloader._DataLoaderIter object at 0x7f378df9f128>>
Traceback (most recent call last):
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 399, in __del__
self._shutdown_workers()
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 378, in _shutdown_workers
self.worker_result_queue.get()
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/multiprocessing/queues.py", line 337, in get
return _ForkingPickler.loads(res)
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/multiprocessing/reductions.py", line 151, in rebuild_storage_fd
fd = df.detach()
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/multiprocessing/resource_sharer.py", line 58, in detach
return reduction.recv_handle(conn)
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/multiprocessing/reduction.py", line 182, in recv_handle
return recvfds(s, 1)[0]
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/multiprocessing/reduction.py", line 153, in recvfds
msg, ancdata, flags, addr = sock.recvmsg(1, socket.CMSG_LEN(bytes_size))
ConnectionResetError: [Errno 104] Connection reset by peer
Traceback (most recent call last):
File "./pix2pixHD/train.py", line 61, in <module>
Variable(data['image']), Variable(data['feat']), infer=save_fake)
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
result = self.forward(*input, **kwargs)
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 121, in forward
return self.module(*inputs[0], **kwargs[0])
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
result = self.forward(*input, **kwargs)
File "/home/ubuntu/project/pix2pixHD/models/pix2pixHD_model.py", line 162, in forward
feat_map = self.netE.forward(real_image, inst_map)
File "/home/ubuntu/project/pix2pixHD/models/networks.py", line 289, in forward
output_ins = outputs[indices[:,0] + b, indices[:,1] + j, indices[:,2], indices[:,3]]
IndexError: index 1 is out of bounds for dimension 1 with size 1
Command:
python ./pix2pixHD/train.py --verbose \
--loadSize 512 --fineSize 512 \
--resize_or_crop none \
--label_nc 0 \
--label_feat \
--no_instance \
--name label2city_512p_feat \
--checkpoints_dir "./pix2pixHD/checkpoints/" \
--dataroot "./pix2pixHD/dataset/" \
--save_latest_freq 500 \
--display_freq 50 --print_freq 50 \
--netG local --ngf 32
Would be 🔥 if you could help me with some more guidance :)
Regards from Germany
@wottpal did you manage to generate good images with different styles?
@wottpal Hi, I have the same problem. If you solve it, could you please share how?
Same problem, not enough docs on this.
@wottpal Same here. Did you solve this problem?
Sorry @ShaniGam @shen113 @daeunni, but never solved it.
@tcwang0509 How can I solve this same problem?
I changed
output_ins = outputs[indices[:,0] + b, indices[:,1] + j, indices[:,2], indices[:,3]]
to
output_ins = outputs[indices[:,0] + b, indices[:,1], indices[:,2], indices[:,3]]
and
outputs_mean[indices[:,0] + b, indices[:,1] + j, indices[:,2], indices[:,3]] = mean_feat
to
outputs_mean[indices[:,0] + b, indices[:,1], indices[:,2], indices[:,3]] = mean_feat
in networks.py (around line 287), and it works fine.
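To see what the two indexing variants do, here is a toy pure-Python stand-in for the tensor indexing (nested lists instead of PyTorch tensors). It rests on an assumption that the thread does not confirm: that with --label_nc 0 and --no_instance the map passed to the encoder has three channels, so the channel index returned by nonzero() can already be 1 or 2 before j is added. The helpers nonzero4d and pool are hypothetical names.

```python
def nonzero4d(mask):
    """Return (b, c, y, x) positions where a nested-list 4-D mask is true."""
    return [
        (b, c, y, x)
        for b, img in enumerate(mask)
        for c, ch in enumerate(img)
        for y, row in enumerate(ch)
        for x, v in enumerate(row)
        if v
    ]


def pool(outputs, positions, j, add_j):
    """Average outputs over positions, optionally shifting the channel by j."""
    vals = [outputs[b][c + j if add_j else c][y][x] for b, c, y, x in positions]
    return sum(vals) / len(vals)


# Toy encoder output: batch 1, 3 channels, 2x2 pixels; channel c holds the value c.
outputs = [[[[float(c)] * 2 for _ in range(2)] for c in range(3)]]

# One-channel map: the channel index is always 0, so 0 + j stays in range.
one_ch = [[[[True, True], [False, False]]]]
# Three-channel map (assumed, e.g. an RGB label image): channel indices reach 2.
three_ch = [[[[True, False], [False, False]] for _ in range(3)]]

ok = [pool(outputs, nonzero4d(one_ch), j, add_j=True) for j in range(3)]

try:
    [pool(outputs, nonzero4d(three_ch), j, add_j=True) for j in range(3)]
    overflow = False
except IndexError:
    overflow = True  # channel 2 + 1 is past the last channel

patched = [pool(outputs, nonzero4d(three_ch), j, add_j=False) for j in range(3)]
print(ok, overflow, patched)
```

In this toy setting the variant without + j no longer raises, but every iteration over j averages the same entries, so it is not equivalent to per-channel pooling. Whether that matters for the generated styles is not something this thread settles.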