PyAV
`frame.to_ndarray(format='rgba')` can cause a crash
Overview
I am trying to process the following file:
The webm file should contain 152 frames, but PyAV reads only 23 frames:
import av
from av.codec.context import CodecContext

in_f = '026.webm'
frames = []
context = CodecContext.create('libvpx-vp9', 'r')
with av.open(in_f) as container:
    for packet in container.demux(video=0):
        for frame in context.decode(packet):
            frames.append(frame)
print(len(frames))
Moreover, the following code causes a crash:
import av
from av.codec.context import CodecContext

in_f = '026.webm'
frames = []
with av.open(in_f) as container:
    context = container.streams.video[0].codec_context  # Crash here
The following code also causes a crash:
import av
from av.codec.context import CodecContext

in_f = '026.webm'
frames = []
context = CodecContext.create('libvpx-vp9', 'r')
with av.open(in_f) as container:
    for packet in container.demux(video=0):
        for frame in context.decode(packet):
            frame_nd = frame.to_ndarray(format='rgba')  # Crash here
            frames.append(frame_nd)
The crash message on Linux:
malloc(): corrupted top size
Aborted (core dumped)
Note that the problem of not reading all frames is present on Windows, macOS (arm64) and Arch Linux. The crash occurs only on Windows and Arch Linux, not on macOS arm64.
Running ffmpeg on the command line demuxes and converts the webm file normally.
Expected behavior
All frames of the webm file are read successfully without a crash.
Actual behavior
Not all frames of the webm file are read, and/or a crash occurs.
Versions
Tested on Windows, macOS arm64 Sonoma, and Arch Linux.
Both the av and pyav wheels from PyPI were tested (pyav contains a newer version of PyAV).
Research
I have done the following:
- [X] Checked the PyAV documentation
- [X] Searched on Google
- [X] Searched on Stack Overflow
- [X] Looked through old GitHub issues
- [ ] Asked on PyAV Gitter
- [ ] ... and waited 72 hours for a response.
None of my media players would play anything beyond 23 frames. I even tried playing 026.webm using ffplay from the latest commit in master (9949c1dd78) and it still only played 23 frames.
Since PyAV uses ffmpeg as its source of truth, iterating over 23 frames is the correct behavior.
I can verify those crashes on Windows, but since it appears that ffmpeg library is crashing, rather than PyAV itself, I don't think there is anything we can do.
Actually, I counted the number of frames with `ffmpeg -i 026.webm frames/%03d.png` and it shows 152 frames.
I checked again using `ffmpeg -i 026.webm test.webm` and it shows 23 frames.
The code I gave should iterate over all frames, including non-keyframes, right?
> ffmpeg library is crashing
But it does not crash if I invoke ffmpeg from command line, so could this mean a problem with how pyav is binding to ffmpeg?
That command, `ffmpeg -i 026.webm frames/%03d.png`, is implemented, I believe, in ffmpeg.c, which is separate from the libraries (libavcodec, libavformat, etc). That means that ffmpeg commands are not really comparable with PyAV.
When I ran that command, I did get 152 frames; however, each of those frames is repeated 5-8 times. When I run the equivalent in PyAV, it only produces the 23 unique frames. I believe that PyAV has the correct behavior here.
If the ffmpeg 5.1 or ffmpeg 6.0 CLI crashes on this input, then there is certainly nothing PyAV can do, at least until PyAV drops support for ffmpeg 5.x. If the CLI or, better yet, a C program demonstrates that it can do the equivalent without crashing, then there may be something PyAV can do, although working out what will be difficult.
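One way to check the duplicate-frame observation is to hash the dumped PNG files and count the distinct digests. The following is only a sketch: `count_unique` and `count_unique_pngs` are hypothetical helper names, the `frames` directory is taken from the command above, and it assumes that repeated frames are written as byte-identical PNG files.

```python
import hashlib
from pathlib import Path


def count_unique(blobs):
    """Return (total, unique) for an iterable of bytes objects."""
    seen = set()
    total = 0
    for blob in blobs:
        total += 1
        # Identical content gives an identical digest.
        seen.add(hashlib.sha256(blob).hexdigest())
    return total, len(seen)


def count_unique_pngs(directory):
    """Hash every PNG in `directory` (assumed to be the ffmpeg output folder)."""
    paths = sorted(Path(directory).glob("*.png"))
    return count_unique(p.read_bytes() for p in paths)
```

Calling `count_unique_pngs("frames")` after running the command would return the total number of files and the number of distinct images, which can then be compared with the frame count PyAV reports.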
Got it. Should we report the crash on Windows and Linux to ffmpeg? I am not sure how to report it, though, as the problem is within the library, not the command itself.
I ended up with this working code:
import av
import numpy as np
from av.codec.context import CodecContext

# NOTE: useful_array is not defined in this snippet. It is assumed to be a local
# helper mirroring the one in av/video/frame.pyx, which returns a plane's data
# as a flat array without line padding.
in_f = '026.webm'
frames = []
context = CodecContext.create('libvpx-vp9', 'r')
with av.open(in_f) as container:
    for packet in container.demux(video=0):
        for frame in context.decode(packet):
            # Round odd dimensions down to even so the 4:2:0 chroma planes divide evenly.
            if frame.width % 2 != 0:
                width = frame.width - 1
            else:
                width = frame.width
            if frame.height % 2 != 0:
                height = frame.height - 1
            else:
                height = frame.height
            # print(frame.format.name)  # yuva420p
            frame = frame.reformat(width=width, height=height, format='yuva420p', dst_colorspace=1)
            # Read each plane; U and V are subsampled by 2 in both directions.
            y = useful_array(frame.planes[0]).reshape(height, width)
            u = useful_array(frame.planes[1]).reshape(height // 2, width // 2)
            v = useful_array(frame.planes[2]).reshape(height // 2, width // 2)
            a = useful_array(frame.planes[3]).reshape(height, width)
            # Upsample chroma to full resolution by repeating each sample 2x2.
            u = u.repeat(2, axis=0).repeat(2, axis=1)
            v = v.repeat(2, axis=0).repeat(2, axis=1)
            y = y.reshape((y.shape[0], y.shape[1], 1))
            u = u.reshape((u.shape[0], u.shape[1], 1))
            v = v.reshape((v.shape[0], v.shape[1], 1))
            a = a.reshape((a.shape[0], a.shape[1], 1))
            yuv_array = np.concatenate((y, u, v), axis=2)
            yuv_array = yuv_array.astype(np.float32)
            # Clip to the limited range and remove the offsets.
            yuv_array[:, :, 0] = yuv_array[:, :, 0].clip(16, 235).astype(yuv_array.dtype) - 16
            yuv_array[:, :, 1:] = yuv_array[:, :, 1:].clip(16, 240).astype(yuv_array.dtype) - 128
            convert = np.array([
                # [1.164, 0.000, 1.793], [1.164, -0.213, -0.533], [1.164, 2.112, 0.000]
                [1.164, 0.000, 2.018], [1.164, -0.813, -0.391], [1.164, 1.596, 0.000]
            ])
            rgb_array = np.matmul(yuv_array, convert.T).clip(0, 255).astype('uint8')
            # Append the alpha plane unchanged.
            rgba_array = np.concatenate((rgb_array, a), axis=2)
            frames.append(rgba_array)
Reference: https://stackoverflow.com/questions/72308308/converting-yuv-to-rgb-in-python-coefficients-work-with-array-dont-work-with-n
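For checking the colors of a single pixel by hand, the limited-range conversion can also be written without NumPy. This is a sketch, not the code used above: `yuv_to_rgb_bt601` is a hypothetical name, and it follows the commonly cited BT.601 limited-range equations, where R depends on V and B depends on U. Which coefficient pairs with which plane depends on the plane and channel order in use, so it is worth comparing the output against a frame with known colors.

```python
def yuv_to_rgb_bt601(y, u, v):
    """Convert one limited-range 8-bit YUV pixel to an (r, g, b) tuple."""
    # Clamp to the nominal limited ranges, then remove the offsets.
    y = min(max(y, 16), 235) - 16
    u = min(max(u, 16), 240) - 128
    v = min(max(v, 16), 240) - 128

    # Commonly cited BT.601 limited-range coefficients.
    r = 1.164 * y + 1.596 * v
    g = 1.164 * y - 0.391 * u - 0.813 * v
    b = 1.164 * y + 2.018 * u

    def to_byte(x):
        # Round to the nearest integer and clamp to the 8-bit range.
        return int(min(max(round(x), 0), 255))

    return to_byte(r), to_byte(g), to_byte(b)
```

Note that this sketch rounds to the nearest integer, whereas `astype('uint8')` in the code above truncates, so results can differ by one.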
After some testing, I think the crash in the second case occurs in this line in av/video/frame.pyx. It calls useful_array and crashes at this line; the reshape(-1) causes the crash. I think this means the reformat() in this line is not returning a correct rgba array, and the crash is NOT occurring in ffmpeg. Would adding support for yuva420p solve the problem?
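For readers unfamiliar with line padding: a plane's buffer may hold more bytes per row (the line size) than the visible width needs, and a helper such as useful_array drops the extra bytes. The following pure-Python sketch illustrates that idea only. `strip_line_padding` is a hypothetical name and this is not PyAV's implementation, which operates on NumPy arrays.

```python
def strip_line_padding(buf, linesize, width, bytes_per_pixel=1):
    """Return the visible bytes of each row, discarding per-row padding."""
    useful = width * bytes_per_pixel
    if linesize < useful:
        raise ValueError("linesize is smaller than the visible row size")
    if len(buf) % linesize != 0:
        raise ValueError("buffer length is not a whole number of rows")
    if linesize == useful:
        return bytes(buf)  # no padding to remove
    # Keep the first `useful` bytes of every `linesize`-byte row.
    rows = (buf[i:i + useful] for i in range(0, len(buf), linesize))
    return b"".join(rows)
```

For example, a 3-pixel-wide plane stored with a line size of 4 has one padding byte at the end of every row, and the function returns only the first three bytes of each row.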
I am currently experiencing the same problem. A crash occurs while converting yuva to rgba. The first frame is converted without a problem, but it crashes immediately on the next frame. Could it be a state management issue in the codec context?
To pinpoint the error: a YUVA420P frame is reformatted to RGBA, the resulting frame gets garbage collected, and then the next YUVA420P to RGBA reformat crashes. In other words, it seems that the converted RGBA frames are not being freed properly (or are freed twice). I'm not sure if this is exactly the same issue.
Below is the code to reproduce this.
from av import VideoFrame
import gc
frame = VideoFrame(width=642, height=640, format="yuva420p")
a = frame.reformat(format="rgba")
del a
gc.collect()
a = frame.reformat(format="rgba") # crashed
This problem does not occur when the width of the VideoFrame is divisible by 16.
from av import VideoFrame
import gc
frame = VideoFrame(width=656, height=640, format="yuva420p")
a = frame.reformat(format="rgba")
del a
gc.collect()
a = frame.reformat(format="rgba") # not crashed
I don't have any knowledge of video encoding and decoding, so I can't fix this, but I've tried to reproduce the error in more detail because I'd like to see this resolved.
After playing around with different build environments, I found that the issue occurs when SSSE3 is enabled (building with the --disable-ssse3 option does not cause this issue).
I googled it and found the following issue: https://trac.ffmpeg.org/ticket/9254
If this issue is not fixed directly in FFmpeg, it seems that the only way to avoid the error is to add appropriate padding.
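A minimal sketch of the padding arithmetic, assuming (based only on the two reproductions above) that a width divisible by 16 avoids the crash. `round_up_to_multiple` is a hypothetical helper; how the padded width is then applied to the frame, and how the extra columns are cropped afterwards, is left open.

```python
def round_up_to_multiple(value, multiple=16):
    """Round `value` up to the nearest multiple of `multiple`."""
    if value < 0 or multiple <= 0:
        raise ValueError("value must be >= 0 and multiple must be > 0")
    # Ceiling division using integer arithmetic only.
    return -(-value // multiple) * multiple
```

With the widths from the reproductions, 642 rounds up to 656, which is the width that did not crash, and 656 stays unchanged.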