manim icon indicating copy to clipboard operation
manim copied to clipboard

Let `SceneFileWriter` access `ffmpeg` via `av` instead of via external process

Open behackl opened this issue 1 year ago โ€ข 11 comments

Potentially, this is a pretty substantial improvement to the overall usability of Manim:

  • av provides wheels for a lot of architectures that come with a statically linked version of ffmpeg (-> users would no longer need to install it on their own)
  • As a result, we could also throw all of the extra steps in our CI pipelines that setup ffmpeg away
  • Plus documentation build times should increase too as less setup is needed; but not only that, ...
  • Piping to an external process is potentially slower than inner-process communication; in a non-exact measurement on my local device I found OpeningManim in example_scenes/basic.py to require only around 7.7sec CPU time on the current main branch vs. 5.7sec CPU time on this branch. (Unsure whether this is significant, we'll see in the documentation build times.)

Depending on how experiments go, there is a lot more to do here:

  • [x] Refactor partial movie file creation
  • [x] Figure out how video filters work exactly (we currently need -vf vflip for OpenGL rendering ... or we solve that differently)
  • [ ] Test creation of movies with other codecs
  • [x] Refactor combination of partial movie files into final output file
  • [ ] Make sure users can also interact with the av settings somehow
  • [ ] Simplify installation documentation (simply pip install manim, plus optional LaTeX installation ... ๐Ÿ‘€)

behackl avatar Dec 07 '23 10:12 behackl

I'd gladly join forces on that, but could we share a branch where we can both edit on?

I've sent an invitation to the repo this branch lies on, behackl/manim -- as soon as you accept, you should have write permissions to the branch too. I've created a forum post in our Discord and pinged you there, just to have a place to chat about this too. ๐Ÿ‘

behackl avatar Dec 08 '23 09:12 behackl

Love the work you're doing here! โค๏ธ

MrDiver avatar Dec 08 '23 23:12 MrDiver

@behackl any idea why tests might fail in CI but not on my computer?

image

jeertmans avatar Dec 11 '23 18:12 jeertmans

@jeertmans locally you are probably on an older version of cairo, which we disabled running graphical tests for (check the output, there are ~200 skipped tests).

The tests currently seem to fail because of a mismatch in the number of generated frames in the graphical tests. Curious!

behackl avatar Dec 12 '23 09:12 behackl

Ah yes... indeed. But https://github.com/ManimCommunity/manim/blob/main/.github/scripts/ci_build_cairo.py install Cairo 1.18, which is not even available from apt (Ubuntu 22.04). The latest version available is 1.16.

jeertmans avatar Dec 12 '23 10:12 jeertmans

Ah yes... indeed. But https://github.com/ManimCommunity/manim/blob/main/.github/scripts/ci_build_cairo.py install Cairo 1.18, which is not even available from apt (Ubuntu 22.04). The latest version available is 1.16.

I have raised an issue about that (#3521), as I think this may be good to document this.

The tests currently seem to fail because of a mismatch in the number of generated frames in the graphical tests. Curious!

Yeah I don't know why it occurs... I have updated the reference frames (not sure if this is the right solution) in https://github.com/ManimCommunity/manim/pull/3501/commits/904cfb46ae8db7dfac01036671991cf81a7823de.

I tested the new manim in a large project I did, and encountered some rendering issue, so will reach back to you when I (hopefully) fixex them!

jeertmans avatar Dec 12 '23 11:12 jeertmans

I'll investigate this locally.

behackl avatar Dec 12 '23 11:12 behackl

Rendering for non-standard output formats needs some work; I believe that the -c copy flag currently is just getting ignored. The .mov generated when rendering with -t appears to be broken.

behackl avatar Jan 31 '24 12:01 behackl

@behackl maybe that might be good to save video files in various format with the release version of Manim, and compare the output video files with this new version (this is assuming that everything was done correctly in prior versions of Manim)

jeertmans avatar Jan 31 '24 13:01 jeertmans

@behackl maybe that might be good to save video files in various format with the release version of Manim, and compare the output video files with this new version (this is assuming that everything was done correctly in prior versions of Manim)

Indeed, I am not even sure in how far correctness of generated files is tested currently. I am thinking of adding some video tests (using the pyav-equivalent of ffprobe) to check that things like the generated format + codec etc. is actually correct. I'm on this! :+1:

behackl avatar Jan 31 '24 13:01 behackl

@behackl maybe that might be good to save video files in various format with the release version of Manim, and compare the output video files with this new version (this is assuming that everything was done correctly in prior versions of Manim)

Indeed, I am not even sure in how far correctness of generated files is tested currently. I am thinking of adding some video tests (using the pyav-equivalent of ffprobe) to check that things like the generated format + codec etc. is actually correct. I'm on this! ๐Ÿ‘

Nice, perfect! Tell me if I can help (I don't have much time, but I'd love to have the release :p)

jeertmans avatar Jan 31 '24 13:01 jeertmans

I might have some time in the next few days to finish this, finally. To do so, I propose the following:

  • The task that remains here is the creation of an (ideally parameterized) test to cover all possible options affecting the renderer (mainly with respect to the different codecs and transparency). There are already a bunch of individual tests, but I am not sure that all possible options are covered. I'll look into this.
  • The tests appear to be failing due to a control data mismatch. Needs to be investigated.

After that, we should merge this: generally speaking, this PR is in a pretty good shape already with the more common rendering options working as intended.

Allowing users to pass custom options to av would be a nice-to-have / new feature anyways -- and a more radical installation instruction rewrite should also rather happen on a separate PR.

behackl avatar Apr 19 '24 23:04 behackl

Hello @behackl, this would be great if we can finalize this PR in the next days/weeks.

I would like to emphasize a few points that are, in my opinion, important to consider for this PR:

  • Fix #3648: Manim somehow produces empty videos, which are silently skipped* when concatenating because FFMPEG's CLI does a lot of cleaning, that PyAV does not do automatically. This is an error that one user of Manim Slides discovered https://github.com/jeertmans/manim-slides/issues/390 (because Manim Slides already uses PyAV) and the easiest work-around I found was to probe each file and remove files without any video stream;
  • Implement #3347 by directly writing audio to partial movies files. I know I am biased because this is a feature request of mine, but IMO that would really benefit Manim Slides, and also potentially Manim users that want to manipulate partial movie files :-)

I initially planned to create and use a separate PyAV library, to reduce to code-complexity (see https://github.com/jeertmans/avfilters), but tackling all possible video formats (especially multiple video streams, audio streams, and subtitles) can be very challenging, and is out of the scope of Manim or Manim Slides. Still, I think you may find interesting code, especially for testing if two media are the same, or else (see https://github.com/jeertmans/avfilters/blob/main/tests/utils.py).

Nevertheless, I think this is important to note that testing for equality (or close) between media is very hard, as many encodings will somehow be lossy.

*: Actually, using ffmpeg to concat four 1 second files + one empty file (see https://github.com/jeertmans/avfilters/blob/48d6f01bbb755660e5b5493d6b49a5629f228ae0/tests/test_concatenate.py#L67-L84) results in a 3.9 seconds files (the empty file seems to the

Let me know what you think :-)

jeertmans avatar Apr 20 '24 07:04 jeertmans

Hey @jeertmans!

  • Regarding #3648: we've found that these partial movie files with run_time 0 are also causing issues with the current ffmpeg CLI-based implementation. From my point of view, these files are pathological; there should not be any animations being rendered with a run_time of 0 seconds. There has been some effort (#3491) to disallow creating these files, and I am open to introducing more restrictions if necessary (I'll check whether the problem reported in manim-slides persists with #3491).

  • As for #3347, it looks to me as if the main problem with an implementation of this functionality -- the requirement of a mechanism that cuts audio of tracks playing across multiple partial movie files and redistributes the pieces correctly -- is unrelated to this PR. In the spirit of getting the av migration done, I'd rather declare this as out of scope for this PR. I'm not against the feature itself though, I can see how it is useful to have the audio in the partial movie files.

Thanks for the pointer to jeertmans/avfilters! You are, of course, right -- video tests are inherently hard. I'm already more or less satisfied with proper metadata tests (which you provided an initial rewritten version via av of on this branch here already :-)). Moreover, we have our frame comparison tests still in place anyways.

behackl avatar Apr 20 '24 08:04 behackl

Hey @jeertmans!

Thanks for you comprehensive reply, let me address the two points separately.

I wasnโ€™t aware of #3491 and this is, to me, a valid solution to #3648. The idea of just that we need to report that error to the user in a meaningful way, and not a libav error :) Which should now be the case here.

  • As for Add audio in partial_movie_files, not at the endย #3347, it looks to me as if the main problem with an implementation of this functionality -- the requirement of a mechanism that cuts audio of tracks playing across multiple partial movie files and redistributes the pieces correctly -- is unrelated to this PR. In the spirit of getting the av migration done, I'd rather declare this as out of scope for this PR. I'm not against the feature itself though, I can see how it is useful to have the audio in the partial movie files.

Sure, adding audio to each segment might not be an easy feature, and probably needs to rewrite the actual thing. If you think this is not part of this PRโ€™s, which I sort of agree, then I donโ€™t mind creating another PR for that feature, once this one is closed (as the audio feature will also use av). Nonetheless, I just wanted to point that so we can keep in mind that you might add an audio stream in the future, and so hopefully we do not write code that is against that :)

jeertmans avatar Apr 20 '24 11:04 jeertmans

  • there is a nice parametrized test for video metadata given different formats and cli-options now. not all formats (e.g., gif) are tested yet.
  • the ubuntu runners apparently segfault ๐Ÿ˜ข
  • transparent vp9-encoded files are currently broken as the new test reveals.

regardless, progress is being made here!

behackl avatar Apr 25 '24 00:04 behackl

Good news: the pipeline looks already much better.

Still to test:

  • --format=gif
  • Scenes with included audio

Bad news: something weird is going on with transparency in webm-files. It appears that this branch produces partial video files that are actually transparent, but this transparency is lost as soon as the files are recombined to the final output.

Indeed, even though we are using the yuva420p pixel format to save the partial movie files, opening such a file with av and inspecting the pixel format of the video stream returns

>>> container = av.open("/home/behackl/code/manim/example_scenes/media/videos/bla/720p30/partial_movie_files/StarScene/4117703622_1377611505_3088195676.webm")
>>> container.streams[0].pix_fmt
'yuv420p'

I am not quite sure how to salvage this yet.

behackl avatar Apr 25 '24 14:04 behackl

With the change in 4c458366, manim correctly produces transparent vp9-encoded (webm) files, verified by inspection -- which, however, I can't test because I can't seem to find a way to extract the first frame with pyav in a way that exposes the alpha channel.

(And conveniently, calling VideoFrame.to_ndarray(format="rgba") on a frame extracted from the video causes a segmentation fault...)

behackl avatar Apr 25 '24 15:04 behackl

With the change in 4c45836, manim correctly produces transparent vp9-encoded (webm) files, verified by inspection -- which, however, I can't test because I can't seem to find a way to extract the first frame with pyav in a way that exposes the alpha channel.

(And conveniently, calling VideoFrame.to_ndarray(format="rgba") on a frame extracted from the video causes a segmentation fault...)

Great work, thanks!! Here is what I found about transparency with PyAV/ffmpeg:

  • https://github.com/PyAV-Org/PyAV/issues/500#issuecomment-513621698
  • and the first comment in https://www.reddit.com/r/ffmpeg/s/fgdOzMiuVZ
  • https://github.com/PyAV-Org/PyAV/issues/1269#issuecomment-1918635446

Maybe this helps?

jeertmans avatar Apr 25 '24 17:04 jeertmans

It did indeed help a little, thanks! But there is still a bit of mystery left. If I extract the first frame using

from av.codec.context import CodecContext
context = CodecContext.create("libvpx-vp9", "r")
packet = next(container.demux(video=0))
first_frame = context.decode(packet)[0].to_ndarray()

then I actually get the correct RGBA-values for the pixel in the upper left corner -- however, the red pixel in the bottom right corner now yields (255, 252, 0, 0) (in ARGB) instead of (255, 255, 0, 0); the red-value is slightly wrong now. Maybe this is some sort of color space issue. I'll play around with this some more.

behackl avatar Apr 25 '24 19:04 behackl

It did indeed help a little, thanks! But there is still a bit of mystery left. If I extract the first frame using

from av.codec.context import CodecContext
context = CodecContext.create("libvpx-vp9", "r")
packet = next(container.demux(video=0))
first_frame = context.decode(packet)[0].to_ndarray()

then I actually get the correct RGBA-values for the pixel in the upper left corner -- however, the red pixel in the bottom right corner now yields (255, 252, 0, 0) (in ARGB) instead of (255, 255, 0, 0); the red-value is slightly wrong now. Maybe this is some sort of color space issue. I'll play around with this some more.

Nice! Yes I also observed that it is hard to get the same pixel values, probably due to color space or maybe lossy encodings. This is why is use assert_all_close :)

jeertmans avatar Apr 25 '24 20:04 jeertmans

  • gifs are now covered by the parametrized test as well,
  • for mp4 + mov audio works (just tested locally, not yet covered by the test)
  • TODO: fix rendering the audio track for --format=webm

behackl avatar Apr 27 '24 22:04 behackl

Just a reminder to update with main and change the CI for MacOS-13 to MacOS-latest

JasonGrace2282 avatar Apr 28 '24 18:04 JasonGrace2282

I did change the pipeline to macos-latest again, but it seems that there are now problems with pango. We might want to delay having a working macos-latest pipeline until after this PR has been merged and deal with this in a follow-up PR.

behackl avatar Apr 29 '24 13:04 behackl

  • gifs are now covered by the parametrized test as well,
  • for mp4 + mov audio works (just tested locally, not yet covered by the test)
  • TODO: fix rendering the audio track for --format=webm

@behackl Can I help you with one or more?

jeertmans avatar Apr 30 '24 09:04 jeertmans

@jeertmans thanks, I have some time today to work on this today, I'm somewhat optimistic that this can be finished. :-)

Current state of things: In 5fe5bb5 I've produced some code that works in principle:

  • We no longer attempt to add an audio stream to gifs,
  • Audio for webm files works, but the current implementation (where an pydub.AudioSegment is exported to an opus file) unfortunately requires ffmpeg. I'll investigate whether we can just write the audio files to .wav files and reencode them internally using pyav again.

I'd then like to modify the codec test to check for the presence of a non-empty audio track, but that should be comparatively easy as soon as the pipeline runs. Will keep sending updates here.

behackl avatar May 01 '24 11:05 behackl

All of our file formats work as intended locally; I've split the codec test, there now is a separate one for gifs. For webm, mov, mp4 the parametrized test explicitly checks whether there is a non-silent audio track.

I think this is in a sufficiently final state to be reviewed / merged. I'd like to completely revamp our installation instructions in a follow-up. We should

  • Significantly simplify the instructions for different operating systems,
  • Investigate the required steps for installing in case there is no binary wheel for pyav for some particular OS/Python version combination, and document them.

behackl avatar May 01 '24 19:05 behackl

Hey, overall amazing job here on this PR! I believe this will make Manim much easier to use for the average user :) Seeing as the tests passed (including the new ones), I tried to "break" the rendering system and came up with a case where it fails. Hopefully :crossed_fingers: it shouldn't be too hard to fix :) Other than that, LGTM! Thanks for your work on this PR :)

Small Wait time

def construct(self):
    m = ManimBanner()
    self.play(m.expand())
    self.wait(1e-9) # this fails

JasonGrace2282 avatar May 02 '24 01:05 JasonGrace2282

Hey, overall amazing job here on this PR! I believe this will make Manim much easier to use for the average user :) Seeing as the tests passed (including the new ones), I tried to "break" the rendering system and came up with a case where it fails. Hopefully ๐Ÿคž it shouldn't be too hard to fix :) Other than that, LGTM! Thanks for your work on this PR :)

Small Wait time

def construct(self):
    m = ManimBanner()
    self.play(m.expand())
    self.wait(1e-9) # this fails

Hello, does it fail to render (I.e., an error is raised) or just fails to add the wait time?

jeertmans avatar May 02 '24 05:05 jeertmans

It full on fails. Good find!

Stack trace
โ•ญโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ Traceback (most recent call last) โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ•ฎ
โ”‚ /Users/behackl/code/manim/manim/cli/render/commands.py:120 in render                             โ”‚
โ”‚                                                                                                  โ”‚
โ”‚   117 โ”‚   โ”‚   โ”‚   try:                                                                           โ”‚
โ”‚   118 โ”‚   โ”‚   โ”‚   โ”‚   with tempconfig({}):                                                       โ”‚
โ”‚   119 โ”‚   โ”‚   โ”‚   โ”‚   โ”‚   scene = SceneClass()                                                   โ”‚
โ”‚ โฑ 120 โ”‚   โ”‚   โ”‚   โ”‚   โ”‚   scene.render()                                                         โ”‚
โ”‚   121 โ”‚   โ”‚   โ”‚   except Exception:                                                              โ”‚
โ”‚   122 โ”‚   โ”‚   โ”‚   โ”‚   error_console.print_exception()                                            โ”‚
โ”‚   123 โ”‚   โ”‚   โ”‚   โ”‚   sys.exit(1)                                                                โ”‚
โ”‚                                                                                                  โ”‚
โ”‚ /Users/behackl/code/manim/manim/scene/scene.py:239 in render                                     โ”‚
โ”‚                                                                                                  โ”‚
โ”‚    236 โ”‚   โ”‚   โ”‚   return True                                                                   โ”‚
โ”‚    237 โ”‚   โ”‚   self.tear_down()                                                                  โ”‚
โ”‚    238 โ”‚   โ”‚   # We have to reset these settings in case of multiple renders.                    โ”‚
โ”‚ โฑ  239 โ”‚   โ”‚   self.renderer.scene_finished(self)                                                โ”‚
โ”‚    240 โ”‚   โ”‚                                                                                     โ”‚
โ”‚    241 โ”‚   โ”‚   # Show info only if animations are rendered or to get image                       โ”‚
โ”‚    242 โ”‚   โ”‚   if (                                                                              โ”‚
โ”‚                                                                                                  โ”‚
โ”‚ /Users/behackl/code/manim/manim/renderer/cairo_renderer.py:268 in scene_finished                 โ”‚
โ”‚                                                                                                  โ”‚
โ”‚   265 โ”‚   def scene_finished(self, scene):                                                       โ”‚
โ”‚   266 โ”‚   โ”‚   # If no animations in scene, render an image instead                               โ”‚
โ”‚   267 โ”‚   โ”‚   if self.num_plays:                                                                 โ”‚
โ”‚ โฑ 268 โ”‚   โ”‚   โ”‚   self.file_writer.finish()                                                      โ”‚
โ”‚   269 โ”‚   โ”‚   elif config.write_to_movie:                                                        โ”‚
โ”‚   270 โ”‚   โ”‚   โ”‚   config.save_last_frame = True                                                  โ”‚
โ”‚   271 โ”‚   โ”‚   โ”‚   config.write_to_movie = False                                                  โ”‚
โ”‚                                                                                                  โ”‚
โ”‚ /Users/behackl/code/manim/manim/scene/scene_file_writer.py:452 in finish                         โ”‚
โ”‚                                                                                                  โ”‚
โ”‚   449 โ”‚   โ”‚   if write_to_movie():                                                               โ”‚
โ”‚   450 โ”‚   โ”‚   โ”‚   if hasattr(self, "writing_process"):                                           โ”‚
โ”‚   451 โ”‚   โ”‚   โ”‚   โ”‚   self.writing_process.terminate()                                           โ”‚
โ”‚ โฑ 452 โ”‚   โ”‚   โ”‚   self.combine_to_movie()                                                        โ”‚
โ”‚   453 โ”‚   โ”‚   โ”‚   if config.save_sections:                                                       โ”‚
โ”‚   454 โ”‚   โ”‚   โ”‚   โ”‚   self.combine_to_section_videos()                                           โ”‚
โ”‚   455 โ”‚   โ”‚   โ”‚   if config["flush_cache"]:                                                      โ”‚
โ”‚                                                                                                  โ”‚
โ”‚ /Users/behackl/code/manim/manim/scene/scene_file_writer.py:671 in combine_to_movie               โ”‚
โ”‚                                                                                                  โ”‚
โ”‚   668 โ”‚   โ”‚   โ”‚   return                                                                         โ”‚
โ”‚   669 โ”‚   โ”‚                                                                                      โ”‚
โ”‚   670 โ”‚   โ”‚   logger.info("Combining to Movie file.")                                            โ”‚
โ”‚ โฑ 671 โ”‚   โ”‚   self.combine_files(                                                                โ”‚
โ”‚   672 โ”‚   โ”‚   โ”‚   partial_movie_files,                                                           โ”‚
โ”‚   673 โ”‚   โ”‚   โ”‚   movie_file_path,                                                               โ”‚
โ”‚   674 โ”‚   โ”‚   โ”‚   is_gif_format(),                                                               โ”‚
โ”‚                                                                                                  โ”‚
โ”‚ /Users/behackl/code/manim/manim/scene/scene_file_writer.py:636 in combine_files                  โ”‚
โ”‚                                                                                                  โ”‚
โ”‚   633 โ”‚   โ”‚   โ”‚   โ”‚   output_container.mux(packet)                                               โ”‚
โ”‚   634 โ”‚   โ”‚                                                                                      โ”‚
โ”‚   635 โ”‚   โ”‚   else:                                                                              โ”‚
โ”‚ โฑ 636 โ”‚   โ”‚   โ”‚   for packet in partial_movies_input.demux(partial_movies_stream):               โ”‚
โ”‚   637 โ”‚   โ”‚   โ”‚   โ”‚   # We need to skip the "flushing" packets that `demux` generates.           โ”‚
โ”‚   638 โ”‚   โ”‚   โ”‚   โ”‚   if packet.dts is None:                                                     โ”‚
โ”‚   639 โ”‚   โ”‚   โ”‚   โ”‚   โ”‚   continue                                                               โ”‚
โ”‚                                                                                                  โ”‚
โ”‚ in demux:166                                                                                     โ”‚
โ”‚                                                                                                  โ”‚
โ”‚ in av.container.core.Container.err_check:286                                                     โ”‚
โ”‚                                                                                                  โ”‚
โ”‚ in av.error.err_check:328                                                                        โ”‚
โ•ฐโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ•ฏ
FileNotFoundError: [Errno 2] No such file or directory:
'/Users/behackl/code/manim/example_scenes/media/videos/bug/720p30/partial_movie_files/BugScene/partial_movie_file_list.t
xt'; last error log: [concat] Impossible to open
'file:/Users/behackl/code/manim/example_scenes/media/videos/bug/720p30/partial_movie_files/BugScene/543634251_3305253770
_1995303018.mp4'

behackl avatar May 02 '24 07:05 behackl