unbuffered reads from ffmpeg
I don't have time to test or work on this much, but I think the code that reads data from ffmpeg can be optimized slightly.
The inefficiency probably stems from the call to read_n_bytes, which reads the data into a string (an immutable type) and then converts it to a numpy array by copying the memory.
For my 1048x1328x3 frames, the change below sped up reading in a tight loop from 283 fps to 293 fps.
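To make the copy concrete, here is a minimal sketch that contrasts the two approaches, using an in-memory `io.BytesIO` as a stand-in for the ffmpeg pipe. The helper names `read_frame_copy` and `read_frame_into` are made up for illustration and are not part of imageio. The second helper uses `readinto` rather than the `np.fromfile` call from the patch, so treat it as one possible alternative, not the patch itself.

```python
import io

import numpy as np


def read_frame_copy(stream, framesize):
    """Read into immutable bytes, then copy into a writable array."""
    data = stream.read(framesize)  # bytes object, cannot be modified in place
    # np.frombuffer gives a read-only view of the bytes; .copy() duplicates it.
    return np.frombuffer(data, dtype=np.uint8).copy()


def read_frame_into(stream, framesize):
    """Read straight into a preallocated array, avoiding the extra copy."""
    frame = np.empty(framesize, dtype=np.uint8)
    view = memoryview(frame)
    got = 0
    while got < framesize:
        # readinto may deliver fewer bytes than requested, so loop until full.
        n = stream.readinto(view[got:])
        if not n:
            raise EOFError("stream ended before a full frame was read")
        got += n
    return frame


data = bytes(range(12))  # pretend this is one tiny 2x2x3 frame
a = read_frame_copy(io.BytesIO(data), len(data))
b = read_frame_into(io.BytesIO(data), len(data))
print(np.array_equal(a, b), b.reshape(2, 2, 3).shape)
```

Both helpers return a writable `uint8` array with the same contents; the difference is only whether the frame passes through an intermediate bytes object first.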
Marginal gain, but maybe somebody needs it. Maybe this can make a bigger difference depending on the workload/hardware/decoding process.
Here is a rough sketch of the patch. You need to set the input to unbuffered; see the numpy bug report linked below.
Patch skeleton
diff --git a/imageio/plugins/ffmpeg.py b/imageio/plugins/ffmpeg.py
index 83f9a9f..5d81546 100644
--- a/imageio/plugins/ffmpeg.py
+++ b/imageio/plugins/ffmpeg.py
@@ -471,7 +471,8 @@ class FfmpegFormat(Format):
# For Windows, set `shell=True` in sp.Popen to prevent popup
# of a command line window in frozen applications.
self._proc = sp.Popen(
- cmd, stdin=sp.PIPE, stdout=sp.PIPE, stderr=sp.PIPE, shell=ISWIN
+ cmd, stdin=sp.PIPE, stdout=sp.PIPE, stderr=sp.PIPE, shell=ISWIN,
+ bufsize=0,
)
# Create thread that keeps reading from stderr
@@ -611,7 +612,9 @@ class FfmpegFormat(Format):
w, h = self._meta["size"]
framesize = self._depth * w * h * self._bytes_per_channel
assert self._proc is not None
-
+ s = np.fromfile(self._proc.stdout, dtype=np.uint8,
+ count=framesize)
+ return s, True
try:
# Read framesize bytes
if self._frame_catcher: # pragma: no cover - camera thing
@@ -644,8 +647,9 @@ class FfmpegFormat(Format):
w, h = self._meta["size"]
# t0 = time.time()
s, is_new = self._read_frame_data()
- result = np.frombuffer(s, dtype=self._dtype).copy()
- result = result.reshape((h, w, self._depth))
+ result = s.reshape(h, w, self._depth)
+ # result = np.frombuffer(s, dtype=self._dtype).copy()
+ # result = result.reshape((h, w, self._depth))
# t1 = time.time()
# print('etime', t1-t0)
I'm not too sure about the other performance implications of buffered vs unbuffered reads; investigating that will take more time than I have. Other parts of the code may be able to benefit from the same approach.
https://github.com/numpy/numpy/issues/12309
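For readers unfamiliar with what `bufsize=0` changes, the sketch below shows the effect on the pipe object that `subprocess.Popen` hands back. It assumes a small Python child process as a stand-in for ffmpeg, and `read_frame` is a hypothetical helper, not an imageio function. It reads with `readinto` instead of `np.fromfile`, so it does not depend on the behavior discussed in the numpy issue.

```python
import subprocess as sp
import sys

import numpy as np

# Stand-in for ffmpeg: a child process that writes 12 raw bytes to stdout.
CMD = [sys.executable, "-c",
       "import sys; sys.stdout.buffer.write(bytes(range(12)))"]


def read_frame(bufsize, framesize=12):
    """Spawn the child with the given bufsize and read one frame from it."""
    proc = sp.Popen(CMD, stdout=sp.PIPE, bufsize=bufsize)
    try:
        # Default bufsize wraps the pipe in io.BufferedReader;
        # bufsize=0 exposes the raw io.FileIO object instead.
        kind = type(proc.stdout).__name__
        frame = np.empty(framesize, dtype=np.uint8)
        view = memoryview(frame)
        got = 0
        while got < framesize:
            # A raw pipe may return short reads, so keep going until full.
            n = proc.stdout.readinto(view[got:])
            if not n:
                raise EOFError("pipe closed before a full frame was read")
            got += n
        return kind, frame
    finally:
        proc.stdout.close()
        proc.wait()


for size in (-1, 0):
    kind, frame = read_frame(size)
    print(size, kind, frame.tolist())
```

The data read is identical either way; what differs is whether Python keeps its own buffer between the OS pipe and the caller, which is the part whose performance impact remains untested here.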
Note that after imageio/imageio#425 most changes will apply to imageio_ffmpeg. But I can imagine you need some (numpy) magic in imageio as well, so leaving the issue here for now.
Transferred this issue from imageio to imageio-ffmpeg. Could this be as simple as adding a bufsize arg to our functions?
I tried a while back. Not too sure why it didn't work.
It may have been something to do with numpy, to be honest. I had a PR in flight with them that I never got the tests to pass for.
I'm going to close this, as I think users who need this additional performance from Python should probably look at the new pyav plugin.