Open Issue of Motion Estimation
Hi, I'm working on MEMC (motion estimation and motion compensation). Great job on solving the occlusion problem. However, I still see some artifacts, even with the newest trained model, RIFE v2.4.
I would very much appreciate any suggestions.
- Why is the motion relation cut off (only 2 frames are used) instead of using a recursive inertial motion prior, as in 3DRS?
- Why does the V2 model use stacked Conv layers instead of ResBlocks?
- How and why does the double flow (in V1, F and -F) work? Is it for better handling of non-linear motion?
- Periodic Texture / Uniform Region artifacts
  - High-frequency static (or small-motion) texture causes artifacts.
  - The traditional approach is to mark these areas (using auto-correlation or frequency analysis) and converge motion from the region outside.
- Aperture Problem artifacts
  - For a small object with large motion relative to the background, the small object's motion disappears.
- Fading scene change artifacts
  - How to deal with blended scene changes, like cross-scene fading, without a top-level detection strategy?
- Logo Protection
  - Some OSD content (like a 50%-transparent logo in the top-right corner) flickers.
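
To make the logo protection idea concrete, here is a minimal sketch of one possible approach, not something RIFE does: mark pixels that stay nearly constant over several input frames, then copy those pixels from an input frame into the interpolated result. The names `static_mask` and `protect_static` and the tolerance `tol` are hypothetical, and the sketch assumes grayscale frames and an opaque overlay. A 50%-transparent logo changes with the background underneath it, so it would need a different cue (for example, edges that persist over time).

```python
import numpy as np

def static_mask(frames, tol=2.0):
    """Mark pixels whose value stays within `tol` between all consecutive frames."""
    stack = np.stack([np.asarray(f, dtype=np.float64) for f in frames])
    # Largest frame-to-frame change seen at each pixel.
    max_change = np.abs(np.diff(stack, axis=0)).max(axis=0)
    return max_change <= tol

def protect_static(interpolated, reference, mask):
    """Overwrite masked pixels of the interpolated frame with the reference frame."""
    out = np.array(interpolated, copy=True)
    out[mask] = np.asarray(reference)[mask]
    return out

# Toy clip: a ramp background that shifts every frame, plus a fixed 3x3 "logo".
frames = []
for t in range(4):
    row = (np.arange(8) * 10 + t * 30) % 250
    frame = np.tile(row, (8, 1)).astype(np.float64)
    frame[2:5, 2:5] = 200.0  # static overlay (assumed opaque)
    frames.append(frame)

mask = static_mask(frames)
fake_interp = np.zeros((8, 8))  # stand-in for an interpolated frame
result = protect_static(fake_interp, frames[0], mask)
```

In the toy clip only the 3x3 overlay is marked as static, so only those pixels are taken from the input frame. On real video, static background would be marked too, which is harmless for this purpose since copying static pixels does not change them much.
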
- Stacked Conv runs 20% faster than ResBlock and performs better in video interpolation. In low-level tasks, the experience of model design seems to be very different. This bothers me a lot. I have recently read many papers on model design, but I have made no progress yet.
- We noticed that objects sometimes change in size, and it is harmful to restrict the NN in handling such changes.
- I think the idea of combining the traditional approach is correct, but I haven't learned these methods yet.
- For improving model performance, we have found some solutions. It may take two months to organize them and develop the new algorithms.
- Logo Protection is an interesting question. I will do some research.
I suffer from these problems too!
Supplement:
- Periodic texture (I call it the repetitive pattern problem): Besides the aforementioned cases (static or small motion), the problem also occurs when a moving object passes through a repetitive pattern region, especially at the motion edge. It is also a challenging problem in motion estimation. I have tried CNN-based methods (PWC-Net, LiteFlowNet3, RAFT and so on) as well as traditional motion estimation algorithms, and all of them hit the same problem, more or less. Take LiteFlowNet3 for example.
- OSD content: logo, subtitle, or other types of static text on video content with large motion.

I also find that RIFE is more robust than other interpolation algorithms on the periodic texture problem, although RIFE also suffers from it. I have not figured out why yet.
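
For reference, the auto-correlation marking mentioned in the original post can be sketched roughly as follows. A repetitive pattern matches itself almost perfectly when shifted by one period, which is exactly why block matching and learned flow both get ambiguous there. The sketch below looks for a strong normalized autocorrelation peak at a non-zero horizontal lag. The function name `periodicity_score`, the `min_lag` parameter, and the horizontal-only search are assumptions made for illustration, not part of RIFE or of any specific MEMC pipeline.

```python
import numpy as np

def periodicity_score(block, min_lag=2):
    """Return (score, lag): the strongest normalized autocorrelation
    at a non-zero horizontal lag, and the lag where it occurs."""
    x = np.asarray(block, dtype=np.float64)
    x = x - x.mean(axis=1, keepdims=True)  # remove per-row brightness
    if (x * x).sum() < 1e-12:
        return 0.0, 0  # flat block: no texture, nothing to correlate
    best_score, best_lag = 0.0, 0
    for lag in range(min_lag, x.shape[1] // 2 + 1):
        a, b = x[:, :-lag], x[:, lag:]  # block vs. shifted copy of itself
        denom = np.sqrt((a * a).sum() * (b * b).sum())
        if denom < 1e-12:
            continue
        score = float((a * b).sum() / denom)
        if score > best_score:
            best_score, best_lag = score, lag
    return best_score, best_lag

# Toy inputs: vertical stripes with period 8, and unstructured noise.
cols = np.arange(64)
stripes = np.tile((cols % 8 < 4).astype(np.float64), (16, 1))
noise = np.random.default_rng(0).standard_normal((16, 64))

stripe_score, stripe_lag = periodicity_score(stripes)
noise_score, _ = periodicity_score(noise)
```

A block whose score is close to 1 would be marked as periodic, and its motion vector taken from neighbouring, non-periodic blocks instead of from its own match. The threshold for "close to 1" has to be tuned per content, and a full version would also search vertical and diagonal lags.
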