About datasets

Open mindingyao opened this issue 3 years ago • 3 comments

Hello, I want to know why the flipped datasets are four times larger than the original (why not twice), and whether the original data and the flipped data are input together during the training process (Thanks for providing all the training datasets if it's convenient). Also, is it possible to improve the performance by inputting forward and backward OF together? Thanks again!

mindingyao avatar Jul 28 '21 01:07 mindingyao

The flipped dataset is four times larger because it contains four versions of each image: the original, a horizontal flip, a vertical flip, and a combined horizontal and vertical flip. We did not compare performance with and without backward OF. The main reason we use both forward and backward OF is that the first or the last optical flow in a video is black and carries no information.

OliverRensu avatar Jul 28 '21 02:07 OliverRensu
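
For illustration, the four-way flip described in the reply above could be sketched as follows. This is a minimal NumPy example, not the repository's actual preprocessing code; the function name `four_way_flips` and the toy array are assumptions made for the sketch.

```python
import numpy as np


def four_way_flips(image):
    """Return the original image plus its three flipped copies.

    `image` is assumed to be an array whose first two axes are
    (height, width); any trailing channel axis is left untouched.
    """
    return [
        image,                        # original
        np.flip(image, axis=1),       # horizontal flip (mirror left-right)
        np.flip(image, axis=0),       # vertical flip (mirror top-bottom)
        np.flip(image, axis=(0, 1)),  # horizontal and vertical flip
    ]


# Toy 2x3 "image" standing in for a real frame.
image = np.arange(6).reshape(2, 3)
augmented = four_way_flips(image)
print(len(augmented))  # one original + three flips
```

Each source frame yields four samples, which is why the augmented set is four times the size of the original rather than twice. In practice the same flip would presumably be applied to the frame, its optical flow image, and its ground-truth mask so that they stay aligned.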

Excuse me again, which evaluation code are you using? I used the official code (https://github.com/davisvideochallenge/davis-matlab) to evaluate your pre-computed results and found that the scores are much higher than those reported. For several other methods, however, the scores match the reported ones. Thanks!

mindingyao avatar Aug 01 '21 07:08 mindingyao

Hi, we retrained the model after submission and achieved higher performance than that reported in the paper. Feel free to use either the score from the paper or the score of the released pretrained model.

OliverRensu avatar Sep 22 '21 16:09 OliverRensu