
Problem with data augmentation

junfanlin opened this issue Oct 30, 2017 · 5 comments

Hi @sampepose, thanks for your great work. While running the code, I found it very difficult for the model to converge with the augmentation process enabled: the training loss and test loss are both very large (about 50). If I comment out the preprocessing code, the model converges very fast, with a training loss of about 2. I then realized that if I set the 'scale' option in dataset_config to True, TensorBoard shows the images correctly, while if I set it to False, TensorBoard fails to show them correctly.

What's more, if I only apply the 'translate', 'rotate' and 'zoom' operations, then regardless of how the 'scale' option is set, TensorBoard shows the images correctly.

So I'm wondering: is there anything about the augmentation process I should pay attention to? And would you mind posting your convergence curve here, so I can make sure I'm training the same way you did?

My system is Ubuntu 14.04, the GPU is a Titan Xp, and I compiled your code with GPU compute capability sm_61.

Looking forward to your reply. Thanks in advance!

junfanlin avatar Oct 30 '17 05:10 junfanlin

Also, would you mind posting your data-generation code? I found that with your provided tfrecord ('sample') TensorBoard shows normal images, but with my generated tfrecord it does not.

Below is my code:

    import cv2
    import numpy as np
    import tensorflow as tf

    compression = tf.python_io.TFRecordCompressionType.ZLIB
    writer = tf.python_io.TFRecordWriter(filename, options=tf.python_io.TFRecordOptions(compression))

    for index in range(len(image1s)):
        if index % 1000 == 0:
            print(index)
        image1 = cv2.imread(image1s[index])
        image2 = cv2.imread(image2s[index])
        _, label = parsingFlo(labels[index])
        # Images are serialized as float64 bytes, the flow field as float32 bytes.
        image_a = np.float64(image1).tobytes()
        image_b = np.float64(image2).tobytes()
        flow = np.float32(label).tobytes()
        example = tf.train.Example(features=tf.train.Features(feature={
            'image_a': tf.train.Feature(bytes_list=tf.train.BytesList(value=[image_a])),
            'image_b': tf.train.Feature(bytes_list=tf.train.BytesList(value=[image_b])),
            'flow': tf.train.Feature(bytes_list=tf.train.BytesList(value=[flow])),
        }))
        writer.write(example.SerializeToString())
    writer.close()
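One thing worth checking (a guess on my part, not confirmed against the repo's reader): raw bytes must be decoded with exactly the dtype they were written with. If the repo's input pipeline decodes `image_a`/`image_b` assuming a dtype other than `np.float64`, the decoded tensors would be garbage, which could explain the broken TensorBoard images. A minimal numpy sketch of the mismatch:

```python
import numpy as np

# A tiny stand-in "image" with pixel values 0-255, like cv2.imread output.
img = np.arange(6, dtype=np.uint8).reshape(2, 3)

# Serialized as float64 bytes, as in the snippet above.
raw = np.float64(img).tobytes()

# Decoding with the matching dtype recovers the image exactly...
decoded_ok = np.frombuffer(raw, dtype=np.float64).reshape(2, 3)
assert (decoded_ok == img).all()

# ...while decoding the same bytes as float32 yields twice as many
# (meaningless) values, which is what a reader expecting float32 sees.
decoded_bad = np.frombuffer(raw, dtype=np.float32)
assert decoded_bad.size == 2 * img.size
```

Comparing the dtype used here against the one in the repo's record-parsing code would rule this out quickly.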

@sampepose THX!

13331151 avatar Oct 30 '17 11:10 13331151

@junfanlin Hello! I've met the same problem. It seems that with the augmentation process, the model can only learn the background information; the loss of flownet_s is about 50. Could you share how you solved this? Thank you~

yinjunbo avatar Feb 06 '18 11:02 yinjunbo

@junfanlin Hi, I've met the same problem; the loss is about 40. Have you solved it? Thank you.

yeshenlin avatar May 30 '18 06:05 yeshenlin

@yeshenlin Hi, I didn't solve it. If you are familiar with PyTorch, I suggest taking a look at NVIDIA's implementation and referring to its data augmentation part. Best regards.

junfanlin avatar May 30 '18 15:05 junfanlin

Hello,

The 'scale' option in dataset_configs needs to be True if, during preprocessing, pixel values (integers between 0 and 255) still need to be scaled to [0, 1] (basically image = image / 255.0). This scaling is already done in the script that converts the data images to tfrecord files (see scripts/convert_fc_to_tfrecords.py). If you haven't generated your tfrecords that way, you need to set scale = True so the scaling happens during training.
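For clarity, a minimal sketch of that scaling step (plain numpy for illustration, assuming uint8 images as returned by cv2.imread):

```python
import numpy as np

# A uint8 image with pixel values in [0, 255].
img = np.array([[0, 128, 255]], dtype=np.uint8)

# The operation the 'scale' option performs when True: map pixels to [0, 1].
scaled = img.astype(np.float32) / 255.0

assert scaled.min() == 0.0 and scaled.max() == 1.0
```

If this was already applied when writing the tfrecords, setting scale = True again would divide by 255 twice and produce near-black images, which matches the broken TensorBoard visualizations described above.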

dokhanh avatar Feb 27 '19 12:02 dokhanh