deepmind-research
Perceiver IO training optical flow
Do you guys provide a way to train optical flow models like you do for ImageNet? If not, could you summarize the steps I would have to take to train an optical flow model using the scripts provided?
Hi @dataplayer12,
Currently we do not plan to open source the scripts we used to train the optical flow models, because they depend heavily on our internal infrastructure and unreleased AutoFlow code. To adapt the provided ImageNet training script to optical flow, you will need to do the following:
- Change the dataset to AutoFlow (https://autoflow-google.github.io/#data).
- Add a curriculum. We used the default curriculum for AutoFlow, which gradually increases the severity of the augmentations over time, with an additional phase where we completely disable all augmentations. During this phase, we feed each image pair twice in a batch, once forward and once reversed. The AutoFlow code is not yet open sourced, so please refer to the supplemental section of the AutoFlow paper for implementation details (a toy schedule is sketched after this list).
- Use the optical flow Perceiver IO model instead of the ImageNet one. The optical flow colab contains the exact model that we used.
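For illustration only, a toy curriculum schedule could look like the sketch below; the phase boundaries, severity values, and the `reverse_pairs` flag are placeholders, not the actual AutoFlow schedule (see the AutoFlow supplement for that).

```python
# Placeholder curriculum: phases are (first_step, augmentation_severity, reverse_pairs).
# The numbers below are illustrative, not the schedule used for the released model.
AUGMENTATION_CURRICULUM = (
    (0,       0.0, True),   # Initial phase: augmentations disabled, each image
                            # pair fed twice per batch (forward and reversed).
    (50_000,  0.2, False),  # Augmentations switched on at low severity...
    (150_000, 0.5, False),
    (300_000, 1.0, False),  # ...ramping up to full severity.
)


def curriculum_phase(step):
  """Returns (augmentation_severity, reverse_pairs) for a given training step."""
  severity, reverse_pairs = 0.0, True
  for first_step, phase_severity, phase_reverse in AUGMENTATION_CURRICULUM:
    if step >= first_step:
      severity, reverse_pairs = phase_severity, phase_reverse
  return severity, reverse_pairs
```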
For optimal results, you should also adapt our script to run on your distributed training setup, in order to train with a large enough batch size. Please refer to Appendix E of our paper for more details, and let us know if you have any further questions.
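To make the loss-side change concrete, here is a rough sketch of how the ImageNet classification loss could be replaced by a dense flow-regression loss. The `forward` transform, the batch keys, and the `[B, 2, H, W, 3]` frame layout are assumptions about an adapted data pipeline, not the released code.

```python
import jax.numpy as jnp


def flow_loss_fn(params, state, rng, batch, forward):
  # Assumed batch layout: batch['frames'] is [B, 2, H, W, 3] image pairs and
  # batch['flow'] is the [B, H, W, 2] ground-truth flow from AutoFlow.
  predicted_flow, new_state = forward.apply(
      params, state, rng, batch['frames'], is_training=True)
  # Per-pixel L1 regression loss in place of softmax cross-entropy.
  loss = jnp.mean(jnp.abs(predicted_flow - batch['flow']))
  return loss, new_state
```

For the distributed setup mentioned above, this loss would typically be differentiated inside a `jax.pmap`-ed update step with gradients averaged via `jax.lax.pmean`, so that the per-device batches add up to a large effective batch size.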
@fding Thanks for the nice reply. I wonder how it was evaluated on Sintel and KITTI? Perceiver IO does not seem to appear on either leaderboard. Did you have internal channels for the evaluation?
Hi @askerlee,
We evaluated on the training splits of Sintel and KITTI, which were also used to generate the results in Table 1 of the AutoFlow paper. Note that we do not train on Sintel and KITTI, unlike most entries on the leaderboards.
Thank you @fding ! This is very helpful.
@fding How critical is batch size in getting it to work? Have you tried a batch size like 16 or 32, instead of 512? (I am trying it now, without much success.)
Hi @aharley, we never tried a smaller batch size.
Note that we used a curriculum to train the optical flow model. In addition to the default AutoFlow curriculum, we added another phase at the very beginning of training, where we disable augmentations and feed every image pair twice in a batch, once forward and once with the images reversed. We approximate the inverse flow by averaging all flows terminating at any given pixel.
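For anyone implementing this, here is a minimal sketch (not the authors' implementation) of that inverse-flow approximation, assuming `flow[..., 0]` and `flow[..., 1]` hold the vertical and horizontal displacements: each negated flow vector is scattered to the (rounded) pixel it terminates at, and all contributions landing on the same pixel are averaged.

```python
import jax.numpy as jnp


def approximate_inverse_flow(forward_flow):
  """Approximates the backward flow from a [H, W, 2] forward flow field."""
  h, w, _ = forward_flow.shape
  ys, xs = jnp.meshgrid(jnp.arange(h), jnp.arange(w), indexing='ij')
  # Pixel each forward flow vector terminates at (rounded, clipped to the image).
  target_y = jnp.clip(jnp.round(ys + forward_flow[..., 0]).astype(jnp.int32), 0, h - 1)
  target_x = jnp.clip(jnp.round(xs + forward_flow[..., 1]).astype(jnp.int32), 0, w - 1)

  flow_sum = jnp.zeros((h, w, 2))
  counts = jnp.zeros((h, w, 1))
  # Scatter the negated flow into the pixel it terminates at; duplicates accumulate.
  flow_sum = flow_sum.at[target_y, target_x].add(-forward_flow)
  counts = counts.at[target_y, target_x].add(1.0)
  # Average all flows terminating at each pixel; pixels nothing maps to stay zero.
  return flow_sum / jnp.maximum(counts, 1.0)
```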
Hi @fding, the authors of AutoFlow (https://autoflow-google.github.io/#data) now provide a static dataset with 40,000 training examples, but I notice that Appendix E of the Perceiver IO paper says: "In all cases, we train on the AutoFlow dataset [74], which consists of 400,000 image pairs".
Is this a clerical error, or has AutoFlow only open sourced part of the data?