DANN_py3

Size of target domain

Open jayanthsiddamsetty opened this issue 3 years ago

I have a training set with 50k source images and 1k target images. Is DANN a good approach for this use case? If not, what is your recommendation?

jayanthsiddamsetty · Feb 04 '22 15:02

It's decided not only by the numbers, but also by the similarity between the source and target images: the bigger the difference, the more data is needed.

fungtion · May 05 '22 01:05

All the works I've seen applying this UDA technique consider roughly the same amount of data in the source and target domains. Now I am working on a project in which the number of source images is much greater than the number of target images. I am not sure if this is a problem, though.

The only thing is that, by setting num_batches = min(len(train_loader), len(target_loader)) and looping over num_batches as:

for epoch in range(NUM_EPOCHS):
    for batch_index in range(num_batches):
        # forward pass on one source batch and one target batch
        # backward pass and optimizer step

it would require many "epochs" (maybe calling them "iterations" would be better) to go through the entire training set.

I think it is possible to loop over the entire training set (i.e., num_batches = len(train_loader)), but force the target data to repeat itself multiple times within a given "epoch". To do that, you can use the cycle function from itertools, like target_loader = cycle(iter(target_loader)). Then, you could use some data augmentation technique to get around the problem of repeating the target data, as sketched below. Does it make sense?
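
A minimal sketch of that pattern, assuming the loader names from the snippet above (note that cycle caches the batches from its first pass over the loader and then replays them in the same order, which is why on-the-fly augmentation helps avoid identical repeats):

from itertools import cycle

# Assumed names from the snippet above: train_loader (source),
# target_loader (target), NUM_EPOCHS.
target_iter = cycle(target_loader)  # caches the first pass, then replays it

for epoch in range(NUM_EPOCHS):
    for source_batch in train_loader:      # one full pass over the source set
        target_batch = next(target_iter)   # target data repeats as needed
        # forward / backward on the (source_batch, target_batch) pair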

rfmiotto · Sep 21 '22 20:09

Does it make sense?

Yes, thanks!

jayanthsiddamsetty · Sep 21 '22 20:09