
Make continuing video processing easier

Open bitplane opened this issue 1 year ago • 2 comments

  1. Increase the number of digits for frames so they sort easily
  2. Process frames in alphanumeric order, rather than arbitrary order decided by the filesystem.
  3. Add a "restart" option which won't download the video again, and will skip frames that have already been processed.
  4. Put some log messages in so that you can see what step it's up to.
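As a rough illustration of points 1 to 4 (not the code in this PR), the idea could be sketched like this. The names `frame_name` and `pending_frames`, the 9-digit width, and the `.jpg` extension are assumptions made up for the example.

```python
import logging
from pathlib import Path
from typing import List

log = logging.getLogger("resume_sketch")  # hypothetical logger name


def frame_name(index: int, digits: int = 9) -> str:
    """Zero-pad the frame number so that name order equals frame order."""
    return f"{index:0{digits}d}.jpg"


def pending_frames(source_dir: Path, dest_dir: Path) -> List[Path]:
    """Return source frames, in name order, that have no output in dest_dir yet."""
    # Sort explicitly instead of relying on the order the filesystem returns.
    frames = sorted(source_dir.glob("*.jpg"), key=lambda p: p.name)
    # A frame counts as processed if a file with the same name already exists.
    pending = [f for f in frames if not (dest_dir / f.name).exists()]
    log.info("%d of %d frames still to process", len(pending), len(frames))
    return pending
```

With zero-padded names, a plain string sort is enough, and a restarted run only sees the frames that are still missing from the destination directory.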

So if you're worried about running out of time on colab, you can tar up the dest images every couple of hours and move them over to drive, restart the session, copy them back in and continue.
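A minimal sketch of that save-and-restore step, using only the standard library `tarfile` module. The function names and paths are hypothetical; copying the archive to and from Drive is left out.

```python
import tarfile
from pathlib import Path


def archive_frames(dest_dir: Path, archive_path: Path) -> int:
    """Pack the processed frames into a tar file and return how many were added."""
    frames = sorted(dest_dir.glob("*.jpg"))
    # Plain "w" (no compression): JPEGs are already compressed.
    with tarfile.open(str(archive_path), "w") as tar:
        for frame in frames:
            tar.add(str(frame), arcname=frame.name)
    return len(frames)


def restore_frames(archive_path: Path, dest_dir: Path) -> None:
    """Unpack a previously saved archive back into dest_dir."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(str(archive_path), "r") as tar:
        tar.extractall(str(dest_dir))
```

After restoring in a fresh session, a resume-style run would find the existing outputs and skip them.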

bitplane avatar Sep 10 '22 20:09 bitplane

Hi Jason, sounds great, thanks for the feedback.

I'll keep hacking away at it in another branch while I'm processing videos and merge here when it's ready for review again, rather than keep force-pushing here and bugging you with notifications! :D

The reason I used restart is because the most natural word to use would be "continue", but that's a reserved word in Python. The convention is to stick an underscore on the end of it like continue_, but that seems horrible, especially in the notebook world where parameter inspection isn't that great. Maybe calling it "resume" instead?
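To make the naming point concrete, here is a tiny check with the standard `keyword` module. The `colorize` function and its parameters are invented for the example and are not DeOldify's API.

```python
import keyword

# "continue" is a reserved word, so it cannot be used as a parameter name;
# "resume" and "restart" are ordinary identifiers.
reserved = {name: keyword.iskeyword(name) for name in ("continue", "resume", "restart")}


def colorize(file_name: str, resume: bool = False) -> str:
    """Hypothetical signature showing how the flag would read at the call site."""
    return f"{file_name}: {'resuming' if resume else 'starting from scratch'}"
```

A call such as `colorize("film.mp4", resume=True)` reads naturally, whereas `continue=True` would be a syntax error.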

Now I know you're interested in changes I'll have a deeper think about how to make it a bit more beautiful. I'm thinking ~~that since we've got a class I should probably put the flags on the instance rather than pass them as parameters,~~ edit: I misread, scratch that... and add some proper docstrings, fix the unsafe os.system calls, use f-strings etc.
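For the os.system point, one common approach is to build the command as a list and hand it to `subprocess.run`, so no shell ever parses the file name. This is only a sketch: the helper names are made up, and the ffmpeg arguments are a generic frame-extraction call, not necessarily the ones DeOldify uses.

```python
import subprocess
from pathlib import Path
from typing import List


def build_extract_command(video: Path, frames_dir: Path, digits: int = 9) -> List[str]:
    """Build an ffmpeg argument list that writes zero-padded JPEG frames."""
    # ffmpeg expands the printf-style pattern, e.g. %09d.jpg -> 000000001.jpg
    pattern = frames_dir / f"%0{digits}d.jpg"
    return ["ffmpeg", "-y", "-i", str(video), str(pattern)]


def run(command: List[str]) -> None:
    """Run without a shell; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(command, check=True)
```

Because each argument is passed as its own list element, a file name containing spaces, quotes, or `;` stays a single argument instead of being interpreted by a shell.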

I've also had a little hack around with skipping frame extraction, but I don't like the way I've implemented it at present; there are a few edge cases to do with restarting after deleting images.

The other things that are bugging me:

  1. I'm wasting a lot of time on Colab just building the video itself. I could tar the frames up and push them into drive while using the GPU to process another video, and do the rest on my CPU at home. So I might also make the actual video combination step optional. But it'd need some thought around properly managing the dependencies.
  2. There's too many files in a dir to be practical in Jupyter's web UI; it doesn't like more than a thousand and chokes when clicking around to look at the results. I think it'd make sense to break outputs into dirs like "012345000/012345678.jpg"
  3. It feels like the global interpreter lock / single threading is killing me locally. I'm getting about 70% GPU usage according to nvtop while I've got one CPU core running hot - I assume it's processing JPEGs or PIL creating images. If I get round to profiling it (I love performance engineering) I'll raise another PR with changes for that; there could be a +40% perf win there.
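The directory layout from point 2 could be computed along these lines. The function name, the 1000-per-directory bucket size, and the 9-digit width are assumptions picked to match the "012345000/012345678.jpg" example.

```python
from pathlib import PurePosixPath


def sharded_frame_path(index: int, per_dir: int = 1000, digits: int = 9) -> PurePosixPath:
    """Map a frame number to '<bucket>/<frame>.jpg' with at most per_dir files per bucket."""
    # Round down to the start of the bucket, e.g. 12345678 -> 12345000.
    bucket = (index // per_dir) * per_dir
    return PurePosixPath(f"{bucket:0{digits}d}") / f"{index:0{digits}d}.jpg"
```

For example, `sharded_frame_path(12345678)` gives `012345000/012345678.jpg`, so no directory ever holds more than a thousand frames.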

bitplane avatar Sep 13 '22 23:09 bitplane

@bitplane That all sounds fantastic. Yeah, as far as performance goes, I've had to do a lot of optimization of our more advanced video stuff in our commercial work with my DeOldify startup. Both speed and memory, actually: doing everything in memory is quite a challenge, for example. It's a pretty deep rabbit hole to get ffmpeg pipelines optimized, and there's a lot of low-hanging fruit. I haven't been in a rush to try optimizing too much for this stuff, but please go ahead if you want to! I can say that the less compression you use on frames, the faster it is, and that's potentially an easy win, but you can use up disk space very quickly, especially with these Colabs. Another potential win is using grayscale formats when applicable.

Oh and "resume" sounds like great wording.

jantic avatar Sep 17 '22 02:09 jantic

Hey sorry, just a quick update - I've neglected this for a bit while I'm working on something else, but I will get back to it sometime this week, hopefully tomorrow or Thursday.

bitplane avatar Sep 27 '22 13:09 bitplane

Sorry it took so long! I've started a new contract and it's been pretty intense over the first month. Finally got around to fixing the parameter changes though, and also added a few more logging things. I've not been using it recently so might have missed some things. I did give it a quick manual test, but I'm a feeble bag of flesh and water; it could do with some test automation!

I'll hopefully have time to do some more old movies soon and look at some perf improvements :)

bitplane avatar Nov 05 '22 00:11 bitplane

Oops sorry I forgot the re-downloading check. Hold fire, I'll deal with that ASAP.

bitplane avatar Nov 06 '22 21:11 bitplane

> Oops sorry I forgot the re-downloading check. Hold fire, I'll deal with that ASAP.

There hasn't been any activity after this. Want to go ahead with this or do you want me to close this out?

jantic avatar Feb 11 '23 23:02 jantic