OASIS
Training the model on my own custom dataset
Thanks for uploading such interesting code. I am wondering whether it is possible to train the model on my own custom dataset. If yes, what is the procedure?
Hi,
Yes, this is possible, and it should be relatively little effort to implement a custom dataloader.
Step 1:
You should create a file CustomDataset.py in the dataloaders folder. Copy-paste all the contents from the Ade20k dataloader dataloaders/Ade20kDataset.py to this file.
Step 2:
Based on the properties of your dataset, you should adjust some parameters in the __init__() function. They should be:
- opt.label_nc: insert here the number of classes in your dataset, excluding the don't care label (instead of 150).
- opt.contain_dontcare_label: this should be True if your dataset has a don't care label, and False otherwise.
- opt.semantic_nc: opt.label_nc + 1 if opt.contain_dontcare_label is True, otherwise simply the same value as opt.label_nc.
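The relation between the three options in Step 2 can be sketched as follows. This is only an illustration: the helper set_label_options and the example value of 10 classes are hypothetical, and in the repository these fields are assigned directly inside __init__().

```python
from argparse import Namespace


def set_label_options(opt, num_classes, has_dontcare):
    """Fill in the three label-related options described in Step 2.

    Hypothetical helper for illustration only; it is not part of the repository.
    """
    # Number of real classes, excluding the don't care label.
    opt.label_nc = num_classes
    # Whether the dataset contains a don't care label.
    opt.contain_dontcare_label = has_dontcare
    # One extra channel is needed when a don't care label exists.
    opt.semantic_nc = num_classes + 1 if has_dontcare else num_classes
    return opt


# Example: a dataset with 10 classes plus a don't care label.
opt = set_label_options(Namespace(), num_classes=10, has_dontcare=True)
print(opt.label_nc, opt.semantic_nc)  # prints: 10 11
```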
Step 3:
The function list_images() should be adjusted to match the structure of your folders. It should return the list of names of images and labels, and a tuple with image and label root folders.
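As a rough sketch of Step 3, the function below lists images and labels for one possible folder layout. The layout (images/<split> and labels/<split> under a data root), the file extensions, and the standalone signature are all assumptions made for this example; the method in the copied dataloader takes different arguments and should be adapted in place.

```python
import os


def list_images(dataroot, split="train"):
    """Return image names, label names, and the two root folders.

    Assumed (hypothetical) layout:
        <dataroot>/images/<split>/xxx.jpg
        <dataroot>/labels/<split>/xxx.png
    """
    path_img = os.path.join(dataroot, "images", split)
    path_lab = os.path.join(dataroot, "labels", split)

    # Sort both lists so that images and labels line up by index.
    images = sorted(f for f in os.listdir(path_img) if f.lower().endswith((".jpg", ".png")))
    labels = sorted(f for f in os.listdir(path_lab) if f.lower().endswith(".png"))

    if len(images) != len(labels):
        raise ValueError("different number of images (%d) and labels (%d)" % (len(images), len(labels)))
    # Each image must have a label with the same base name.
    for img, lab in zip(images, labels):
        if os.path.splitext(img)[0] != os.path.splitext(lab)[0]:
            raise ValueError("image %s and label %s do not match" % (img, lab))

    return images, labels, (path_img, path_lab)
```

Checking that the base names match catches a missing or misnamed file early, before training starts pairing images with the wrong label maps.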
Step 4:
The newly created file should be referenced in /dataloaders/dataloaders.py. For this, add the following two lines to the get_dataset_name() function:
if mode == "custom":
    return "CustomDataset"
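After the change in Step 4, get_dataset_name() might look roughly like the sketch below. Only the "custom" branch comes from this thread; the "ade20k" branch and the final ValueError are assumptions about the surrounding code, so keep whatever branches the actual file already has.

```python
def get_dataset_name(mode):
    """Map the --dataset_mode string to the name of the dataset class/file."""
    if mode == "ade20k":  # assumed existing branch, shown for context
        return "Ade20kDataset"
    if mode == "custom":  # the two lines added in Step 4
        return "CustomDataset"
    # Assumption: fail loudly for unknown modes.
    raise ValueError("There is no such dataset mode as %s" % mode)
```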
Things to keep in mind:
- If your dataset has a don't care label, then for correct computation of losses this class should go in front of all other classes, so it should have id=0.
- After implementing it, the program should be called with the flag --dataset_mode custom
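If the don't care label in your data is stored under some other id, the label maps can be remapped so that it becomes id=0. The sketch below assumes a hypothetical dataset where the classes are numbered from 0 and unlabelled pixels carry the value 255; adjust dontcare_id to whatever your data uses.

```python
import numpy as np


def move_dontcare_to_zero(label_map, dontcare_id=255):
    """Shift all class ids up by one and put the don't care label at id 0.

    Assumes classes are numbered 0..N-1 and unlabelled pixels equal dontcare_id
    (255 is only an example value).
    """
    label_map = np.asarray(label_map)
    # Classes 0..N-1 become 1..N, which frees id 0.
    remapped = label_map.astype(np.int64) + 1
    # Pixels that were don't care in the original map go to id 0.
    remapped[label_map == dontcare_id] = 0
    return remapped


example = np.array([[0, 1], [255, 2]])
print(move_dontcare_to_zero(example).tolist())  # prints: [[1, 2], [0, 3]]
```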
Let me know whether this works for you. Regards, Vadim
@SushkoVadim Thank you so much. I did what you suggested and got the following error
--- Now computing Inception activations for real set ---
--- Finished FID stats for real set ---
Created OASIS_Generator with 74314691 parameters
Created OASIS_Discriminator with 22258904 parameters
Traceback (most recent call last):
File "train.py", line 44, in
By the way, I am trying to do style transfer from overcast images to rainy images, so I do not have label images or label classes. What do you suggest I set label_nc to?
@SushkoVadim Update: I changed L156 in models.py to target_map[target_map == c] = torch.randint(0,2,(1,)).cuda() and got the following error:
Created OASIS_Generator with 78461891 parameters
Created OASIS_Discriminator with 22268654 parameters
/home/izzeddin/anaconda3/lib/python3.8/site-packages/torch/nn/functional.py:1628: UserWarning: nn.functional.tanh is deprecated. Use torch.tanh instead.
  warnings.warn("nn.functional.tanh is deprecated. Use torch.tanh instead.")
/home/izzeddin/anaconda3/lib/python3.8/site-packages/torch/nn/parallel/_functions.py:64: UserWarning: Was asked to gather along dimension 0, but all input tensors were scalars; will instead unsqueeze and return a vector.
  warnings.warn('Was asked to gather along dimension 0, but all '
[epoch 0/200 - iter 0], time:0.000
I think it is stuck at iteration 0!
Hi,
Our project is mainly designed for semantic image synthesis, and we haven't tested it for general image-to-image translation. For example, the generator always expects to receive a semantic label map to feed to SPADE layers, while the loss function expects label maps for loss computation.
You can in principle set label_nc to zero and adapt the code to be used without label maps (no SPADE layers, only binary cross-entropy loss), but please note that this would probably require some re-implementation of our functions.
For your last message:
[epoch 0/200 - iter 0], time:0.000
actually means that the network has successfully passed the first training iteration. Did you wait longer until the next message appears?
By default, the program prints such a message every 1000 iterations; you can set this parameter manually via --freq_print.
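To illustrate why the log can look stuck after iteration 0, here is a minimal sketch of a print schedule with a frequency of 1000. The helper is_print_iteration and the modulo check are assumptions made for illustration; the actual training loop may implement this differently.

```python
def is_print_iteration(cur_iter, freq_print=1000):
    """Return True on iterations where a progress message would be printed.

    Hypothetical helper; assumes a simple modulo-based schedule.
    """
    return cur_iter % freq_print == 0


# With the default frequency, the next message after iter 0 comes at iter 1000.
print([i for i in range(0, 2001) if is_print_iteration(i)])  # prints: [0, 1000, 2000]
```

Passing a smaller value, for example --freq_print 10, makes progress visible much sooner.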
@SushkoVadim Thanks a lot for your comprehensive answer.
One minor addition to your Step 2:
When copying the content from Ade20kDataset, do not forget to rename the class from Ade20kDataset to CustomDataset.
Best regards, Lennart
Excuse me, I want to ask some questions about using custom datasets. How should opt.label_nc be defined? I found that label_nc in your code differs from the dataset description on Papers with Code (https://paperswithcode.com/dataset/cityscapes). Moreover, my custom dataset has fewer labels than Cityscapes (29 in total, including the class 'others'), so should I set opt.label_nc = 28 and semantic_nc = 29? And should I define a specific labelcolormap function, as is done for Cityscapes?
Best wishes
Hi, yes, opt.label_nc should be set to the number of semantic classes, and semantic_nc is set to opt.label_nc+1 in case you have a "don't care", "unlabelled", or "other" label. So in your example, the numbers 28 and 29 should be correct.
There probably exist different versions of Cityscapes; for the one we used, we observed 35 classes (link).
The colormap is not needed. The dataloader uses a simple representation of label maps with an integer assigned to each pixel. Cityscapes also has label maps without any colormap, which are used by our dataloader.
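A quick sanity check for such integer label maps could look like the sketch below. The helper check_label_map is hypothetical and not part of the repository; it only verifies that a label map is single-channel and that every pixel id fits into semantic_nc channels (29 in the example discussed above).

```python
import numpy as np


def check_label_map(label_map, semantic_nc):
    """Verify that a label map holds one integer id per pixel in [0, semantic_nc).

    Hypothetical helper for illustration. Returns the sorted list of ids found.
    """
    label_map = np.asarray(label_map)
    if label_map.ndim != 2:
        # A color-coded (H, W, 3) image would fail here.
        raise ValueError("expected a single-channel (H, W) label map, got shape %s" % (label_map.shape,))
    ids = np.unique(label_map)
    if ids.min() < 0 or ids.max() >= semantic_nc:
        raise ValueError("label ids %s fall outside [0, %d)" % (ids.tolist(), semantic_nc))
    return ids.tolist()


# Example with semantic_nc = 29 (28 classes plus one "other" label at id 0).
print(check_label_map([[0, 1], [28, 3]], semantic_nc=29))  # prints: [0, 1, 3, 28]
```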