DeepSIM
Infinite Loop Error (keeps starting train.py for some reason)
Hello, I haven't made any modifications to the code. I cloned it, installed the requirements, and ran the script for training. No other options, no custom data.
As you can see, it ran train.py twice for some reason and then exited with a broken pipe error. I've included the error output below (Output 1).
I then tried debugging this on another machine, adding the if __name__ == "__main__": guard to prevent train.py from re-running itself.
It seemed like line 74 from train.py was causing this issue:
This too caused an error, albeit a different one. I've included that one as well (Output 2)
Thank you.
Here's the error Output 1:
(deepsim2) D:\DeepSIM>python ./train.py --dataroot ./datasets/car --primitive seg --no_instance --tps_aug 1 --name DeepSIMCar
C:\Users\Public\Anaconda\envs\deepsim2\lib\site-packages\torchvision\io\image.py:13: UserWarning: Failed to load image Python extension:
warn(f"Failed to load image Python extension: {e}")
name DeepSIMCar
[0]
------------ Options -------------
affine_aug: none
batchSize: 1
beta1: 0.5
canny_aug: 0
canny_color: 0
canny_sigma_l_bound: 1.2
canny_sigma_step: 0.3
canny_sigma_u_bound: 3
checkpoints_dir: ./checkpoints
continue_train: False
cutmix_aug: 0
cutmix_max_size: 96
cutmix_min_size: 32
data_type: 32
dataroot: ./datasets/car
debug: False
display_freq: 100
display_winsize: 512
feat_num: 3
fineSize: 256
fp16: False
gpu_ids: [0]
input_nc: 3
instance_feat: False
isTrain: True
label_feat: False
label_nc: 0
lambda_feat: 10.0
loadSize: 256
load_features: False
load_pretrain:
local_rank: 0
lr: 0.0002
max_dataset_size: inf
model: pix2pixHD
nThreads: 2
n_blocks_global: 9
n_blocks_local: 3
n_clusters: 10
n_downsample_E: 4
n_downsample_global: 4
n_layers_D: 3
n_local_enhancers: 1
name: DeepSIMCar
ndf: 64
nef: 16
netG: global
ngf: 64
niter: 8000
niter_decay: 8000
niter_fix_global: 0
no_flip: False
no_ganFeat_loss: False
no_html: False
no_instance: True
no_lsgan: False
no_vgg_loss: False
norm: instance
num_D: 2
output_nc: 3
phase: train
pool_size: 0
primitive: seg
print_freq: 100
resize_or_crop: none
save_epoch_freq: 20000
save_latest_freq: 20000
serial_batches: False
test_canny_sigma: 2
tf_log: False
tps_aug: 1
tps_percent: 0.99
tps_points_per_dim: 3
use_dropout: False
verbose: False
which_epoch: latest
-------------- End ----------------
./train.py:11: DeprecationWarning: fractions.gcd() is deprecated. Use math.gcd() instead.
def lcm(a, b): return abs(a * b) / fractions.gcd(a, b) if a and b else 0
CustomDatasetDataLoader
dataset [AlignedDataset] was created
#training images = 1
C:\Users\Public\Anaconda\envs\deepsim2\lib\site-packages\torchvision\models\_utils.py:209: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and will be removed in 0.15, please use 'weights' instead.
f"The parameter '{pretrained_param}' is deprecated since 0.13 and will be removed in 0.15, "
C:\Users\Public\Anaconda\envs\deepsim2\lib\site-packages\torchvision\models\_utils.py:223: UserWarning: Arguments other than a weight enum or None for 'weights' are deprecated since 0.13 and will be removed in 0.15. The current behavior is equivalent to passing weights=VGG19_Weights.IMAGENET1K_V1. You can also use weights=VGG19_Weights.DEFAULT to get the most up-to-date weights.
warnings.warn(msg)
create web directory ./checkpoints\DeepSIMCar\web...
display_delta 0
print_delta 0.0
save_delta 0
C:\Users\Public\Anaconda\envs\deepsim2\lib\site-packages\torchvision\io\image.py:13: UserWarning: Failed to load image Python extension:
warn(f"Failed to load image Python extension: {e}")
name DeepSIMCar
[0]
------------ Options -------------
affine_aug: none
batchSize: 1
beta1: 0.5
canny_aug: 0
canny_color: 0
canny_sigma_l_bound: 1.2
canny_sigma_step: 0.3
canny_sigma_u_bound: 3
checkpoints_dir: ./checkpoints
continue_train: False
cutmix_aug: 0
cutmix_max_size: 96
cutmix_min_size: 32
data_type: 32
dataroot: ./datasets/car
debug: False
display_freq: 100
display_winsize: 512
feat_num: 3
fineSize: 256
fp16: False
gpu_ids: [0]
input_nc: 3
instance_feat: False
isTrain: True
label_feat: False
label_nc: 0
lambda_feat: 10.0
loadSize: 256
load_features: False
load_pretrain:
local_rank: 0
lr: 0.0002
max_dataset_size: inf
model: pix2pixHD
nThreads: 2
n_blocks_global: 9
n_blocks_local: 3
n_clusters: 10
n_downsample_E: 4
n_downsample_global: 4
n_layers_D: 3
n_local_enhancers: 1
name: DeepSIMCar
ndf: 64
nef: 16
netG: global
ngf: 64
niter: 8000
niter_decay: 8000
niter_fix_global: 0
no_flip: False
no_ganFeat_loss: False
no_html: False
no_instance: True
no_lsgan: False
no_vgg_loss: False
norm: instance
num_D: 2
output_nc: 3
phase: train
pool_size: 0
primitive: seg
print_freq: 100
resize_or_crop: none
save_epoch_freq: 20000
save_latest_freq: 20000
serial_batches: False
test_canny_sigma: 2
tf_log: False
tps_aug: 1
tps_percent: 0.99
tps_points_per_dim: 3
use_dropout: False
verbose: False
which_epoch: latest
-------------- End ----------------
CustomDatasetDataLoader
dataset [AlignedDataset] was created
#training images = 1
C:\Users\Public\Anaconda\envs\deepsim2\lib\site-packages\torchvision\models\_utils.py:209: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and will be removed in 0.15, please use 'weights' instead.
f"The parameter '{pretrained_param}' is deprecated since 0.13 and will be removed in 0.15, "
C:\Users\Public\Anaconda\envs\deepsim2\lib\site-packages\torchvision\models\_utils.py:223: UserWarning: Arguments other than a weight enum or None for 'weights' are deprecated since 0.13 and will be removed in 0.15. The current behavior is equivalent to passing weights=VGG19_Weights.IMAGENET1K_V1. You can also use weights=VGG19_Weights.DEFAULT to get the most up-to-date weights.
warnings.warn(msg)
create web directory ./checkpoints\DeepSIMCar\web...
display_delta 0
print_delta 0.0
save_delta 0
Traceback (most recent call last):
File "
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
    if __name__ == '__main__':
        freeze_support()
        ...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
Here's the error Output 2:
(deepsim) PS E:\JM\GAN\deepsim> python ./train.py --dataroot ./datasets/car --primitive seg --no_instance --tps_aug 1 --name DeepSIMCar
C:\Users\TWiM.conda\envs\deepsim\lib\site-packages\torchvision\io\image.py:13: UserWarning: Failed to load image Python extension:
warn(f"Failed to load image Python extension: {e}")
name DeepSIMCar
[0]
------------ Options -------------
affine_aug: none
batchSize: 1
beta1: 0.5
canny_aug: 0
canny_color: 0
canny_sigma_l_bound: 1.2
canny_sigma_step: 0.3
canny_sigma_u_bound: 3
checkpoints_dir: ./checkpoints
continue_train: False
cutmix_aug: 0
cutmix_max_size: 96
cutmix_min_size: 32
data_type: 32
dataroot: ./datasets/car
debug: False
display_freq: 100
display_winsize: 512
feat_num: 3
fineSize: 256
fp16: False
gpu_ids: [0]
input_nc: 3
instance_feat: False
isTrain: True
label_feat: False
label_nc: 0
lambda_feat: 10.0
loadSize: 256
load_features: False
load_pretrain:
local_rank: 0
lr: 0.0002
max_dataset_size: inf
model: pix2pixHD
nThreads: 2
n_blocks_global: 9
n_blocks_local: 3
n_clusters: 10
n_downsample_E: 4
n_downsample_global: 4
n_layers_D: 3
n_local_enhancers: 1
name: DeepSIMCar
ndf: 64
nef: 16
netG: global
ngf: 64
niter: 8000
niter_decay: 8000
niter_fix_global: 0
no_flip: False
no_ganFeat_loss: False
no_html: False
no_instance: True
no_lsgan: False
no_vgg_loss: False
norm: instance
num_D: 2
output_nc: 3
phase: train
pool_size: 0
primitive: seg
print_freq: 100
resize_or_crop: none
save_epoch_freq: 20000
save_latest_freq: 20000
serial_batches: False
test_canny_sigma: 2
tf_log: False
tps_aug: 1
tps_percent: 0.99
tps_points_per_dim: 3
use_dropout: False
verbose: False
which_epoch: latest
-------------- End ----------------
./train.py:16: DeprecationWarning: fractions.gcd() is deprecated. Use math.gcd() instead.
def lcm(a, b): return abs(a * b) / fractions.gcd(a, b) if a and b else 0
CustomDatasetDataLoader
dataset [AlignedDataset] was created
#training images = 1
C:\Users\TWiM.conda\envs\deepsim\lib\site-packages\torchvision\models\_utils.py:209: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and will be removed in 0.15, please use 'weights' instead.
f"The parameter '{pretrained_param}' is deprecated since 0.13 and will be removed in 0.15, "
C:\Users\TWiM.conda\envs\deepsim\lib\site-packages\torchvision\models\_utils.py:223: UserWarning: Arguments other than a weight enum or None for 'weights' are deprecated since 0.13 and will be removed in 0.15. The current behavior is equivalent to passing weights=VGG19_Weights.IMAGENET1K_V1. You can also use weights=VGG19_Weights.DEFAULT to get the most up-to-date weights.
warnings.warn(msg)
create web directory ./checkpoints\DeepSIMCar\web...
display_delta 0
print_delta 0.0
save_delta 0
C:\Users\TWiM.conda\envs\deepsim\lib\site-packages\torchvision\io\image.py:13: UserWarning: Failed to load image Python extension:
warn(f"Failed to load image Python extension: {e}")
Traceback (most recent call last):
File "
I found the problem and it seems to be fixed for now. Documenting it for future adventurers.
Go to DeepSIM/data/custom_dataset_data_loader.py in your repository.
https://github.com/eliahuhorwitz/DeepSIM/blob/master/data/custom_dataset_data_loader.py
class CustomDatasetDataLoader(BaseDataLoader):
    def name(self):
        return 'CustomDatasetDataLoader'

    def initialize(self, opt):
        BaseDataLoader.initialize(self, opt)
        if opt.isTrain:
            self.dataset = CreateDataset(opt)
        else:
            self.dataset = CreateDataset_test(opt)
        self.dataloader = torch.utils.data.DataLoader(
            self.dataset,
            batch_size=opt.batchSize,
            shuffle=not opt.serial_batches,
            num_workers=int(opt.nThreads),
            worker_init_fn=lambda _: np.random.seed())

    def load_data(self):
        return self.
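Replace the entirety of line 38 (the worker_init_fn line in the snippet above, if your copy matches) with a single closing parenthesis, so that the DataLoader call ends right after num_workers. It seems like the multi-process data loading (the DataLoader workers) is messing with everything.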
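If you would rather keep the per-worker reseeding than delete it, one option is to swap the lambda for a module-level function. This is only a sketch, assuming the crash comes from Windows starting DataLoader workers as fresh processes and having to pickle worker_init_fn, which does not work for a lambda. seed_worker and is_picklable are names made up for illustration, and the stdlib random module stands in for np.random so the snippet runs without NumPy.

```python
import pickle
import random


def seed_worker(worker_id):
    """Hypothetical module-level replacement for the lambda.

    Reseeds the RNG in each worker; the original snippet calls
    np.random.seed() at this point instead of random.seed().
    """
    random.seed()


def is_picklable(obj):
    """Return True if pickle can serialize obj, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


if __name__ == "__main__":
    # A lambda has no importable name, so pickle cannot serialize it.
    print("lambda picklable:", is_picklable(lambda _: random.seed()))
    # A function defined at module level is pickled by reference to its name.
    print("seed_worker picklable:", is_picklable(seed_worker))
```

In custom_dataset_data_loader.py the equivalent change would be to define such a function at module level (calling np.random.seed() inside it) and pass worker_init_fn=seed_worker. That variant is untested against this repository.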
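Then, go to train.py, put everything in a main() function, and call it from an if __name__ == "__main__": guard at the bottom of the file:
def main():
    # literally all of train.py here
    ...

if __name__ == "__main__":
    main()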
This worked for me and at least the code is running.
For test.py, you should also put everything under an if __name__ == "__main__": guard.
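To see why the guard matters: as far as I understand it, on Windows each DataLoader worker is a new Python process that re-runs the main script under the module name __mp_main__ instead of __main__, so unguarded top-level code runs again in every worker. The sketch below imitates that with runpy from the standard library and does not start any processes. GUARDED, UNGUARDED, run_as and train_demo.py are made-up stand-ins, not files from the repository.

```python
import os
import runpy
import tempfile
import textwrap

# Stand-in for a train.py whose body lives in main() behind the guard.
GUARDED = textwrap.dedent('''
    calls = []

    def main():
        calls.append("train")   # stands in for the whole body of train.py

    if __name__ == "__main__":
        main()
''')

# Stand-in for a train.py that does its work at the top level.
UNGUARDED = textwrap.dedent('''
    calls = []
    calls.append("train")       # top-level code runs on every import
''')


def run_as(source, run_name):
    """Run source as a script under the given module name; return its calls list."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "train_demo.py")
        with open(path, "w") as f:
            f.write(source)
        return runpy.run_path(path, run_name=run_name)["calls"]


if __name__ == "__main__":
    # Launched by the user: the guard is true, training runs once.
    print(run_as(GUARDED, "__main__"))       # ['train']
    # Re-run the way a spawned worker would: the guard is false, nothing runs.
    print(run_as(GUARDED, "__mp_main__"))    # []
    # Without the guard, the worker would start training all over again.
    print(run_as(UNGUARDED, "__mp_main__"))  # ['train']
```

The third case is the behaviour in Output 1, where the options block is printed a second time before the crash.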