XVFI
Inferencing on a different dataset
How can we do inference on a different video dataset, for example with images of a different resolution? It seems only the default datasets are supported for inference right now?
Also, I would like to know how to do only 2x interpolation. It seems that is not supported?
I'll add another related question: what padding should we do for handling arbitrary resolutions, and is the padding the same for the two models (trained on the XVFI dataset and trained on Vimeo90k)?
Hi, thank you for your interest @rsjjdesj , @sniklaus .
-
"How can we do inference on a different video dataset?" => We have added a new option, test_custom, that enables testing on a custom dataset. Thank you for your request.
-
"Also, I would like to know how to do only 2x interpolation. It seems that is not supported?" => Sorry for the restricted option; x2 is now available as well. We have modified the '--multiple' option in the parser.
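To illustrate what '--multiple' controls, here is a minimal sketch (not code from the repository) of the intermediate time steps a frame-interpolation network is evaluated at for a given multiple; with --multiple 2, only the midpoint t = 0.5 is synthesized between each input frame pair:

```python
def interp_timesteps(multiple):
    """Intermediate time positions between two input frames
    for a given interpolation multiple (sketch, not repo code)."""
    return [i / multiple for i in range(1, multiple)]

print(interp_timesteps(2))  # [0.5]
print(interp_timesteps(8))  # [0.125, 0.25, 0.375, 0.5, 0.625, 0.75, 0.875]
```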
-
"what padding should we do for handling arbitrary resolutions, and is the padding the same for the two models (trained on the XVFI dataset and trained on Vimeo90k)?" => The input size for our network depends on '--S_tst' and '--module_scale_factor'. Therefore, the padding rules are different for the models trained on X-TRAIN and Vimeo90K. There is also a description in the paper:
For this, we train XVFI-Net variants by fully utilizing 512×512-sized patches because the spatial resolution of the training inputs should be a multiple of 512 for S_trn = 5, where the number 512 is determined as 2^{S_trn = 5} × M (= 4) × 4 (via the bottlenecks of the autoencoders).
We have reflected this issue in https://github.com/JihyongOh/XVFI/blob/484bdea1448c22459b10548a488909c268e1dde9/main.py#L333-L343 Please use the updated code (main.py, utils.py) for inference. Thank you :)
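The quoted passage implies the spatial size must be divisible by 2^{S_tst} × M × 4 (e.g. 512 = 2^5 × 4 × 4 for S_trn = 5, M = 4). A hedged sketch of the resulting padding amounts, assuming that divisor (the exact rule lives in the linked main.py lines and may differ):

```python
def pad_amounts(h, w, s_tst, module_scale_factor=4):
    """How many rows/columns of padding a (h, w) input needs so that
    both dimensions are divisible by 2**s_tst * M * 4.
    Sketch only; the repository's main.py implements the actual rule."""
    div = (2 ** s_tst) * module_scale_factor * 4
    pad_h = (div - h % div) % div
    pad_w = (div - w % div) % div
    return pad_h, pad_w

# 1080p input with S_tst=5: divisor is 512, so 1920x1080 is padded up to 2048x1536.
print(pad_amounts(1080, 1920, 5))  # (456, 128)
```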
Thanks for adding these features. For inference on 1080p images (1920x1080), which pre-trained model would you recommend based on your experiments: the 4K one or the Vimeo one?
Also, have you tested the model on animation content?
@rsjjdesj
- Based on the experimental results in Table 2 and the analyses of adjustable scalability on the Adobe240fps dataset we used (1280x720 HD), we think the XVFI-Net trained on X-TRAIN (the 4K one) would give better results on 1080p images. We recommend carefully tuning S_tst among (2,3,4,5), which is also an advantage of our model.
- No, we have not tested our model on any animation content, but it seems worth trying.
When I test on a custom dataset, I reproduced the specified directory structure, but there still seems to be a problem. English is not my native language, so I apologize for any errors or unclear descriptions.
@98mxr Thank you for your interest. Could you describe your problem in detail, with a screen-captured image?
Sure, I think I can illustrate the problem with a picture.
I put two pictures in ./XVFI/custom_path/scene1 and ran
main.py --gpu 0 --phase test_custom --exp_num 1 --dataset X4K1000FPS --module_scale_factor 4 --S_tst 5 --multiple 8 --custom_path ./custom_path

I get nothing, even after I created ./XVFI/custom_path/scene1/custom_path/scene1/
@98mxr We did not account for the path delimiter being different on Windows ('\') and Linux ('/'). We have now reflected this in both 'main.py' and 'utils.py' by using 'os.sep' and 'os.path.join()' instead of hard-coding '/'. Please re-download both files. Sorry for the inconvenience.
Thank you.
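For illustration, a minimal sketch of the cross-platform pattern described above: building and splitting paths with os.path.join and os.sep rather than a hard-coded '/'. The folder names 'custom_path' and 'scene1' simply mirror the example in this thread.

```python
import os

# Build the path with the platform's own separator:
# 'custom_path/scene1' on Linux, 'custom_path\scene1' on Windows.
scene_dir = os.path.join('custom_path', 'scene1')

# Recover the scene name portably; splitting on a literal '/'
# would fail on Windows, while os.sep works on both platforms.
scene_name = scene_dir.split(os.sep)[-1]
print(scene_name)  # scene1
```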
It works correctly now, thank you for your work and attention.