DeepMosaics
Training with your own dataset does not work
When creating a dataset, I get empty folders, although I added the original images and a mask to the required directories according to the instructions. For example, running
python make_pix2pix_dataset.py --datadir ../datasets/draw/face --hd --outsize 512 --fold 1 --name face --savedir ../datasets/pix2pix/face --mod drawn --minsize 128 --square
produces:
../datasets/pix2pix/face existed
../datasets/pix2pix/face\train_A existed
../datasets/pix2pix/face\train_B existed
segment parameters: 12.4M
Find images: 1
It looks like processing is going on, but the folders remain empty. Only the "opt" file and the empty "train_A" and "train_B" folders appear.
(deep) PS E:\DeepMosaics\train\clean> python train.py --dataset ../../datasets/video/face --savename face --n_blocks 4 --lambda_GAN 0.01 --loadsize 286 --finesize 256 --batchsize 16 --n_layers_D 2 --num_D 3 --n_epoch 200 --gpu_id 4,5,6,7 --load_thread 8
checkpoints\face existed
Please run "tensorboard --logdir checkpoints/tensorboardX --host=your_server_ip" and input "2021-06-12_23-06-24" to filter outputs
checkpoints\face existed
Please run "tensorboard --logdir checkpoints/tensorboardX --host=your_server_ip" and input "2021-06-12_23-06-26" to filter outputs
Traceback (most recent call last):
File "
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
When creating the dataset, it only reports "Find images: 1". If that image doesn't fit the filter, the output will be empty; you can try more images. The next problem: multiprocessing in Python is hard to get working on Windows. You can try changing the number of processes to 1 or running it on Linux.
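For reference, the guard that the RuntimeError above asks for looks like this. This is a minimal sketch of the standard idiom for Windows' spawn start method, not the actual DeepMosaics entry point:

import multiprocessing as mp

def worker(i):
    # placeholder for the real per-process work (e.g. loading frames)
    print('worker', i, 'started')

if __name__ == '__main__':  # required on Windows: spawn re-imports this module
    mp.freeze_support()     # only matters when frozen into an executable
    p = mp.Process(target=worker, args=(0,))
    p.start()
    p.join()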
The number of images does not affect the result.
Same issue here, though the model I created works fine if I use it to add mosaic (on Linux).
It always stops 2 images before the end.
It works for video.
What's the filter setting?
No dataset for pix2pix addmosaic is created; empty folders are created. Command:
python make_pix2pix_dataset.py --datadir ../datasets/draw/face --hd --outsize 512 --fold 1 --name face --savedir ../datasets/pix2pix/face --mod drawn --minsize 128 --square
Training the model when using a video dataset does not start. The command
python make_video_dataset.py --model_path ../pretrained_models/mosaic/add_face.pth --gpu_id 0 --datadir 'dir for your videos' --savedir ../datasets/video/face
gives out:
checkpoints\face existed
Please run "tensorboard --logdir checkpoints/tensorboardX --host=your_server_ip" and input "2021-06-12_23-06-24" to filter outputs
checkpoints\face existed
Please run "tensorboard --logdir checkpoints/tensorboardX --host=your_server_ip" and input "2021-06-12_23-06-26" to filter outputs
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\ProgramData\Anaconda31\envs\deep\lib\multiprocessing\spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "C:\ProgramData\Anaconda31\envs\deep\lib\multiprocessing\spawn.py", line 125, in _main
    prepare(preparation_data)
  File "C:\ProgramData\Anaconda31\envs\deep\lib\multiprocessing\spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "C:\ProgramData\Anaconda31\envs\deep\lib\multiprocessing\spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "C:\ProgramData\Anaconda31\envs\deep\lib\runpy.py", line 268, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "C:\ProgramData\Anaconda31\envs\deep\lib\runpy.py", line 97, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "C:\ProgramData\Anaconda31\envs\deep\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "E:\DeepMosaics\train\clean\train.py", line 117, in <module>
    Videodataloader_train = dataloader.VideoDataLoader(opt, videolist_train)
  File "E:\DeepMosaics\train\clean../..\util\dataloader.py", line 115, in __init__
    self.load_init()
  File "E:\DeepMosaics\train\clean../..\util\dataloader.py", line 138, in load_init
    p.start()
  File "C:\ProgramData\Anaconda31\envs\deep\lib\multiprocessing\process.py", line 121, in start
    self._popen = self._Popen(self)
  File "C:\ProgramData\Anaconda31\envs\deep\lib\multiprocessing\context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "C:\ProgramData\Anaconda31\envs\deep\lib\multiprocessing\context.py", line 327, in _Popen
    return Popen(process_obj)
  File "C:\ProgramData\Anaconda31\envs\deep\lib\multiprocessing\popen_spawn_win32.py", line 45, in __init__
    prep_data = spawn.get_preparation_data(process_obj._name)
  File "C:\ProgramData\Anaconda31\envs\deep\lib\multiprocessing\spawn.py", line 154, in get_preparation_data
    _check_not_importing_main()
  File "C:\ProgramData\Anaconda31\envs\deep\lib\multiprocessing\spawn.py", line 134, in _check_not_importing_main
    raise RuntimeError('''
RuntimeError:
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.
I've been able to make the database myself. You put the masks in the A folder and the source images in the B folder. It works.
To use my mosaic model, I plan on making a video consisting of all my source files (the png or jpg) and making the database/training using it in the meantime.
I would like to be able to conform my sources to the filter, but I'm lacking data.
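As a quick sanity check on that layout, something like the following can list mismatched pairs. This is a hypothetical helper, assuming masks and source images share filenames, not part of DeepMosaics:

import os

a_dir = '../datasets/pix2pix/face/train_A'  # masks
b_dir = '../datasets/pix2pix/face/train_B'  # source images

masks = set(os.listdir(a_dir))
sources = set(os.listdir(b_dir))
print('masks without a source image:', sorted(masks - sources))
print('source images without a mask:', sorted(sources - masks))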
On your advice, the addmosaic training starts, but the clean-mosaic training still doesn't work. With the video dataset I get:
(deep2) PS E:\DeepMosaics\train\clean> python train.py --dataset ../../datasets/video/face --savename face --n_blocks 4 --lambda_GAN 0.01 --loadsize 286 --finesize 256 --batchsize 16 --n_layers_D 2 --num_D 3 --n_epoch 200 --gpu_id 0 --load_thread 1
checkpoints\face existed
Please run "tensorboard --logdir checkpoints/tensorboardX --host=your_server_ip" and input "2021-06-27_04-39-25" to filter outputs
checkpoints\face existed
Please run "tensorboard --logdir checkpoints/tensorboardX --host=your_server_ip" and input "2021-06-27_04-39-29" to filter outputs
Traceback (most recent call last):
File "
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
For image datasets, pix2pixHD training doesn't work either; an empty checkpoints/web directory is created:
(deep2) PS E:\DeepMosaics\pix2pixHD> python train.py --name face --resize_or_crop resize_and_crop --loadSize 563 --fineSize 512 --label_nc 0 --no_instance --dataroot ../datasets/pix2pix/face
------------ Options -------------
batchSize: 1
beta1: 0.5
checkpoints_dir: ./checkpoints
continue_train: False
data_type: 32
dataroot: ../datasets/pix2pix/face
debug: False
display_freq: 100
display_winsize: 512
feat_num: 3
fineSize: 512
fp16: False
gpu_ids: [0]
input_nc: 3
instance_feat: False
isTrain: True
label_feat: False
label_nc: 0
lambda_feat: 10.0
loadSize: 563
load_features: False
load_pretrain:
local_rank: 0
lr: 0.0002
max_dataset_size: inf
model: pix2pixHD
nThreads: 2
n_blocks_global: 9
n_blocks_local: 3
n_clusters: 10
n_downsample_E: 4
n_downsample_global: 4
n_layers_D: 3
n_local_enhancers: 1
name: face
ndf: 64
nef: 16
netG: global
ngf: 64
niter: 100
niter_decay: 100
niter_fix_global: 0
no_flip: False
no_ganFeat_loss: False
no_html: False
no_instance: True
no_lsgan: False
no_vgg_loss: False
norm: instance
num_D: 2
output_nc: 3
phase: train
pool_size: 0
print_freq: 100
resize_or_crop: resize_and_crop
save_epoch_freq: 10
save_latest_freq: 1000
serial_batches: False
tf_log: False
use_dropout: False
verbose: False
which_epoch: latest
-------------- End ----------------
train.py:9: DeprecationWarning: fractions.gcd() is deprecated. Use math.gcd() instead.
def lcm(a,b): return abs(a * b)/fractions.gcd(a,b) if a and b else 0
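Incidentally, that warning points at an easy fix. A one-line sketch using math.gcd, untested against this repo:

import math
def lcm(a, b): return abs(a * b) // math.gcd(a, b) if a and b else 0  # // keeps the result an int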
CustomDatasetDataLoader
dataset [AlignedDataset] was created
#training images = 1260
GlobalGenerator(
(model): Sequential(
(0): ReflectionPad2d((3, 3, 3, 3))
(1): Conv2d(3, 64, kernel_size=(7, 7), stride=(1, 1))
(2): InstanceNorm2d(64, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(3): ReLU(inplace=True)
(4): Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
(5): InstanceNorm2d(128, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(6): ReLU(inplace=True)
(7): Conv2d(128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
(8): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(9): ReLU(inplace=True)
(10): Conv2d(256, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
(11): InstanceNorm2d(512, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(12): ReLU(inplace=True)
(13): Conv2d(512, 1024, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
(14): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(15): ReLU(inplace=True)
(16): ResnetBlock(
(conv_block): Sequential(
(0): ReflectionPad2d((1, 1, 1, 1))
(1): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(2): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(3): ReLU(inplace=True)
(4): ReflectionPad2d((1, 1, 1, 1))
(5): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(6): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
)
)
(17): ResnetBlock(
(conv_block): Sequential(
(0): ReflectionPad2d((1, 1, 1, 1))
(1): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(2): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(3): ReLU(inplace=True)
(4): ReflectionPad2d((1, 1, 1, 1))
(5): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(6): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
)
)
(18): ResnetBlock(
(conv_block): Sequential(
(0): ReflectionPad2d((1, 1, 1, 1))
(1): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(2): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(3): ReLU(inplace=True)
(4): ReflectionPad2d((1, 1, 1, 1))
(5): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(6): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
)
)
(19): ResnetBlock(
(conv_block): Sequential(
(0): ReflectionPad2d((1, 1, 1, 1))
(1): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(2): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(3): ReLU(inplace=True)
(4): ReflectionPad2d((1, 1, 1, 1))
(5): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(6): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
)
)
(20): ResnetBlock(
(conv_block): Sequential(
(0): ReflectionPad2d((1, 1, 1, 1))
(1): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(2): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(3): ReLU(inplace=True)
(4): ReflectionPad2d((1, 1, 1, 1))
(5): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(6): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
)
)
(21): ResnetBlock(
(conv_block): Sequential(
(0): ReflectionPad2d((1, 1, 1, 1))
(1): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(2): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(3): ReLU(inplace=True)
(4): ReflectionPad2d((1, 1, 1, 1))
(5): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(6): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
)
)
(22): ResnetBlock(
(conv_block): Sequential(
(0): ReflectionPad2d((1, 1, 1, 1))
(1): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(2): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(3): ReLU(inplace=True)
(4): ReflectionPad2d((1, 1, 1, 1))
(5): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(6): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
)
)
(23): ResnetBlock(
(conv_block): Sequential(
(0): ReflectionPad2d((1, 1, 1, 1))
(1): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(2): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(3): ReLU(inplace=True)
(4): ReflectionPad2d((1, 1, 1, 1))
(5): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(6): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
)
)
(24): ResnetBlock(
(conv_block): Sequential(
(0): ReflectionPad2d((1, 1, 1, 1))
(1): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(2): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(3): ReLU(inplace=True)
(4): ReflectionPad2d((1, 1, 1, 1))
(5): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(6): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
)
)
(25): ConvTranspose2d(1024, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), output_padding=(1, 1))
(26): InstanceNorm2d(512, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(27): ReLU(inplace=True)
(28): ConvTranspose2d(512, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), output_padding=(1, 1))
(29): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(30): ReLU(inplace=True)
(31): ConvTranspose2d(256, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), output_padding=(1, 1))
(32): InstanceNorm2d(128, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(33): ReLU(inplace=True)
(34): ConvTranspose2d(128, 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), output_padding=(1, 1))
(35): InstanceNorm2d(64, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(36): ReLU(inplace=True)
(37): ReflectionPad2d((3, 3, 3, 3))
(38): Conv2d(64, 3, kernel_size=(7, 7), stride=(1, 1))
(39): Tanh()
)
)
MultiscaleDiscriminator(
(scale0_layer0): Sequential(
(0): Conv2d(6, 64, kernel_size=(4, 4), stride=(2, 2), padding=(2, 2))
(1): LeakyReLU(negative_slope=0.2, inplace=True)
)
(scale0_layer1): Sequential(
(0): Conv2d(64, 128, kernel_size=(4, 4), stride=(2, 2), padding=(2, 2))
(1): InstanceNorm2d(128, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(2): LeakyReLU(negative_slope=0.2, inplace=True)
)
(scale0_layer2): Sequential(
(0): Conv2d(128, 256, kernel_size=(4, 4), stride=(2, 2), padding=(2, 2))
(1): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(2): LeakyReLU(negative_slope=0.2, inplace=True)
)
(scale0_layer3): Sequential(
(0): Conv2d(256, 512, kernel_size=(4, 4), stride=(1, 1), padding=(2, 2))
(1): InstanceNorm2d(512, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(2): LeakyReLU(negative_slope=0.2, inplace=True)
)
(scale0_layer4): Sequential(
(0): Conv2d(512, 1, kernel_size=(4, 4), stride=(1, 1), padding=(2, 2))
)
(scale1_layer0): Sequential(
(0): Conv2d(6, 64, kernel_size=(4, 4), stride=(2, 2), padding=(2, 2))
(1): LeakyReLU(negative_slope=0.2, inplace=True)
)
(scale1_layer1): Sequential(
(0): Conv2d(64, 128, kernel_size=(4, 4), stride=(2, 2), padding=(2, 2))
(1): InstanceNorm2d(128, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(2): LeakyReLU(negative_slope=0.2, inplace=True)
)
(scale1_layer2): Sequential(
(0): Conv2d(128, 256, kernel_size=(4, 4), stride=(2, 2), padding=(2, 2))
(1): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(2): LeakyReLU(negative_slope=0.2, inplace=True)
)
(scale1_layer3): Sequential(
(0): Conv2d(256, 512, kernel_size=(4, 4), stride=(1, 1), padding=(2, 2))
(1): InstanceNorm2d(512, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(2): LeakyReLU(negative_slope=0.2, inplace=True)
)
(scale1_layer4): Sequential(
(0): Conv2d(512, 1, kernel_size=(4, 4), stride=(1, 1), padding=(2, 2))
)
(downsample): AvgPool2d(kernel_size=3, stride=2, padding=[1, 1])
)
create web directory ./checkpoints\face\web...
[the Options block, network summary, and "create web directory" line above print a second time here, repeated by the spawned process]
Traceback (most recent call last):
File "
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
I've just tested on Windows with the same environment as my Linux, and I got the exact same error as you.
On Ubuntu, the video cleaning training also does not work; an empty web folder is created. Pix2pixHD cleaning training works. Creating a clean-mosaic dataset from a drawn mask to make pix2pix(HD) datasets also does not work.
I've finished my initial add model and made some scripts to sort out the trash from creating the video dataset.
I just started training with clean video and it's working.
I've modified the environment a lot, though, to make it work with my RTX 3090, and added some QOL improvements to the base code; I've documented everything in my fork if that helps you.
Thank you. I will check if it works on my RTX 2080.
Video cleaning training also does not work; processing does not start:
(deep1) goger@goger-System-Product-Name:~/DeepMosaicsRT/train/clean$ python train.py --dataset ../../datasets/video/face --savename face --n_blocks 4 --lambda_GAN 0.01 --loadsize 286 --finesize 256 --batchsize 16 --n_layers_D 2 --num_D 3 --n_epoch 200 --gpu_id 4,5,6,7 --load_thread 16
makedir: checkpoints/face
Please run "tensorboard --logdir checkpoints/tensorboard --host=your_server_ip" and input "2021-06-30_03-53-11" to filter outputs
(deep1) goger@goger-System-Product-Name:~/DeepMosaicsRT/train/clean$
Nothing has changed; everything remains the same, only the file "events.out.tfevents.1624999991.goger-System-Product-Name" is created.
Reduce the number of threads, and set gpu_id to 0 if you have only one GPU.
My VM only has 16GB of RAM, and with 6 threads it's saturated; it didn't work with 16 either. Check your video card memory too with nvidia-smi.
python train.py --dataset ../../datasets/video/face --savename face --n_blocks 4 --lambda_GAN 0.01 --loadsize 286 --finesize 256 --batchsize 16 --n_layers_D 2 --num_D 3 --n_epoch 200 --gpu_id 0 --load_thread 1, and nothing has changed.
@ethanfel
Good job!
I haven't used the dataset-generation code for a long time, and it may have some bugs. I looked at your code, and some of the changes are very effective.
I will check this part in my code.
Check your vram. Video training take more than 16GB when i’m using it so it may be the reason.
@ginpigin I will fix my code and show how to determine some parameters. And whenever you see the "freeze_support()" error, it means you have to make sure --load_thread is 1, or run on Linux.
Yes, I know. When training it, I have to use 4x RTX 2080 and it takes about one week...
Nice :D I think I have documented everything. I will improve the manage script to automatically rename the last useful folder, to remove the gaps created by the rm.
Yeah, I think the VRAM requirement may mislead people trying to do the video training with one GPU. The network needs more than 16GB to even start, I think. My other card is a GTX 1080 8GB, and it doesn't start there with the same environment.
I'm currently at iteration 80,000 on my first video training, and the results in TensorBoard look awesome. You can be proud of your work; it works well.
I am running on Ubuntu in an anaconda environment. It does not issue any errors; it just does not work. An empty folder and the file events.out.tfevents.1624999991 are created. Of course, I don't know much about this, but will the memory of the 4 video cards be summed? I thought the memory would still be 8GB.
Hey, after much reading on optimizing the use of my RTX, I found this: https://developer.download.nvidia.com/video/gputechconf/gtc/2019/presentation/s9926-tensor-core-performance-the-ultimate-guide.pdf
To lower the memory use of the network, lower the batch size, but keep it a multiple of 8; the same goes for load_thread, a multiple of 8.
You also have to tune it to never use swap; swapping reduces your iteration speed by a lot (4x slower for me).
Currently, with my RTX 3090, I've increased the batch size to 24 (3x8) and load_thread to 8 (1x8). It uses 21GB of VRAM and 25GB of RAM, and the CPU runs at 100%.
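For concreteness, those settings dropped into the train.py command used earlier in this thread would look like this (same paths as above; only batchsize and load_thread change):

python train.py --dataset ../../datasets/video/face --savename face --n_blocks 4 --lambda_GAN 0.01 --loadsize 286 --finesize 256 --batchsize 24 --n_layers_D 2 --num_D 3 --n_epoch 200 --gpu_id 0 --load_thread 8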
Thanks, I'll try. Did I understand correctly that I need to set batchsize 8 and load_thread 8 for 8GB?
You have to tune the batch_size to fit your VRAM and the load_thread to fit your RAM.
A smaller batch_size will have a negative impact on efficiency, though.
how to?
It's all in the command line:
python train.py --dataset ../../datasets/video/face --savename face --n_blocks 4 --lambda_GAN 0.01 --loadsize 286 --finesize 256 --batchsize **16** --n_layers_D 2 --num_D 3 --n_epoch 200 --gpu_id 0 --load_thread **1**
You use nvidia-smi and htop to look at your VRAM and RAM to tune your parameters.
Total : 8192 MiB
Used : 6721 MiB
Free : 1471 MiB
Well, I looked: 8 GB. What should I do with this information? I am not particularly versed in commands or in programming.
You launch the training and watch the VRAM/RAM: if the VRAM is at 100% and the training doesn't start, you lower the loadsize; if the RAM is at 100% and swap increases, you lower the load_thread.
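For reference, the watching can be done from a second terminal with standard tools, e.g.:

nvidia-smi --query-gpu=memory.used,memory.total --format=csv -l 2   (VRAM, refreshed every 2s)
free -h -s 2   (RAM and swap, refreshed every 2s)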
Which parameters should I change? Batch size and load thread, or something else? Within what limits? Is it a multiple of 2 or another number, or does it not matter?