FFmpeg-Kit on Android consumes excessive time and memory compared to Termux
Background: Termux is a Linux terminal emulator on the Android platform. We can install FFmpeg in it by executing `pkg install ffmpeg`, and the version currently provided is `6.1.1`.
Recently, I compared the performance of transcoding videos using FFmpeg via Termux and via FFmpeg-Kit and found that Termux consumes much less time and memory.
For example, on my Android 14 phone, FFmpeg in Termux transcoded a video in 44.4 seconds and consumed 604M of memory. However, using FFmpeg-Kit (implemented with `com.arthenica:ffmpeg-kit-full-gpl:6.0-2.LTS` and calling `FFmpegKit.executeAsync()`) for the same task required 82.1 seconds and 724M of memory. Screenshots:
To ensure this is not device-specific, I conducted tests on an Android 10 x86-64 emulator. Termux completed the task in 30 seconds, using 585M of memory, while FFmpeg-Kit took 88 seconds and utilized nearly 1GB of memory. Screenshots:
Despite various attempts to find the cause — including making sure both were using the arm64-v8a platform (or both were x86-64), importing different videos, altering bit depths, and hardware acceleration settings, as well as adjusting power options and background process limits — the huge performance gap persisted consistently.
I tested video transcoding on Windows using FFmpeg versions `6.0` and `6.1.1` and observed no significant performance differences.
So I am wondering whether this behavior is unexpected. It would be great if FFmpeg inside FFmpeg-Kit could perform as well as it does in Termux.
The FFmpeg-Kit project brings great convenience to Android developers like me who develop functions based on FFmpeg. I am eager to provide additional information or assistance if required. Any insights or guidance would be sincerely appreciated.
Thanks for the benchmarks. According to the screenshots:
The test on `Termux`:
- Decodes an `h265` file natively by enabling hw accelerated decoders available via the `auto` flag
- Encodes the video stream using the `mediacodec hevc` encoder
The test on `FFmpegKit`:
- Decodes a file natively. No hw accelerated decoders are used. Input format is also not visible in the screenshot
- Encodes the video stream using the `x265` encoder
Which means this test mostly compares `Termux`'s `hevc_mediacodec` encoder with `FFmpegKit`'s `x265` encoder.
Is it normal to see `FFmpegKit`'s `x265` encoder consume twice as much CPU and RAM? I guess so. We have an ASM Support wiki page, where we list libraries that cannot fully use CPU-specific instructions. Unfortunately, `x265` is listed there for both `Android` and `iOS`. We also know that there are memory leaks in `x265`. So it is no surprise to see `x265` perform so badly.
If you have time I suggest testing the performance of the same encoder in both implementations.
I apologize for not providing a full screenshot initially. However, I am sure I am not making the wrong comparison. The command lines used in each round of comparison are exactly the same except for the output file path.
In the first round of comparison (on my Android 14 phone), command lines are visible, both used hevc(native) decoding and hevc_mediacodec (hardware accelerated) encoding. You can also see that the file size output by both is the same at 22440KB.
In the second comparison (on Android 10 emulator), both used hevc(native) decoding and libx265(software) encoding. `x265 [info]` appears in the output of the command line, which proves that Termux used the software encoder libx265 here. The difference in file size may be due to different versions of libx265.
Another comparison, hevc (native) -> h264 (libx264): Termux takes 23.7 seconds and 756M of memory; FFmpeg-Kit takes 32.2 seconds and 896M of memory.
Thanks for updating the screenshots. Now, I see the difference for the same codec.
Well, the `x265` result is not surprising, as I said before. But the difference for `x264` and `mediacodec` is too large: `x264` takes roughly 36% more time (32.2 s vs 23.7 s), and `mediacodec` takes almost twice the time it spends on Termux (82.1 s vs 44.4 s).
On one hand, it is good to have a reference library to compare our performance against. On the other hand, identifying the root cause requires extensive testing.
I'll add a task to the `Roadmap` to analyse how `termux` achieves that.
If you have time, I suggest testing `FFmpegKit` with the following options and checking if the performance improves in any of those scenarios.
- No GUI in the `FFmpegKit` test application
- Commands executed via synchronous `FFmpegKit.execute()` calls
- Commands executed after redirection is disabled via `FFmpegKitConfig.disableRedirection()`
I tested all three scenarios and did not seem to observe any performance improvement.
Could we customize the version of FFmpeg source code when compiling FFmpeg-Kit? If possible, maybe I can try it myself using a newer version of FFmpeg.
Okay. Thanks for checking. Well, this line defines the `FFmpeg` version to be compiled. You can try using a newer version there.
I noticed that the wiki page Speed Optimization mentions that the `--speed` option can increase the speed of FFmpeg operations but is not set by default. Could the reason be related to this?
Also, I think the performance issues I mentioned may still need to be reproduced by anyone else to ensure that this issue is not caused by my personal compilation environment. (If it is because of my stupid mistake that this issue arose, I would feel really embarrassed to inconvenience you!🥹)
`Speed` and `size` are two primary concerns for `FFmpegKit` users.
I previously tested the `--speed` option, but I didn't observe significant improvements in my tests. Consequently, I decided not to enable it, at least to reduce size, which is still not good enough for most users.
However, these tests were conducted on older versions. It may be necessary to rerun them to reassess the situation. Unfortunately, time constraints are a significant factor for me. I am currently pressed for time, and we have limited contributions to address these issues.
I appreciate your contribution and feedback. I will make an effort to dedicate some time to `termux` in the upcoming 1-2 weeks. I will share my findings here.
Today, I conducted some tests on the `libx264` encoder using the largest test file from example.com. It is an 18 MB file.
I ran the following command on an `arm64-v8a` device.
-y -benchmark -i example.mp4 -c:v libx264 compressed_ffmpeg_kit_full_gpl_x264.mp4
I observed a difference in memory usage. Other than that, I didn't see a significant difference between `ffmpeg 6.1.1 @ termux 0.118` and `ffmpeg-kit-full-gpl-6.0-2` in terms of CPU usage.
This is `termux`, where the `FFmpeg 6.1.1` binary is compiled using NDK r26b and Android API Level 24.
bench: utime=224.977s stime=4.960s rtime=32.341s
bench: maxrss=687092kB
This is `ffmpeg-kit 6.0`, compiled on NDK r22b and Android API Level 24.
bench: utime=241.973s stime=5.484s rtime=33.303s
bench: maxrss=776324kB
I also repeated my tests on a local `ffmpeg-kit 6.0` binary compiled on NDK r26d. There is a very small improvement in CPU usage, but it is nowhere near the difference you observed in your tests.
bench: utime=219.950s stime=4.762s rtime=31.421s
bench: maxrss=777468kB
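For anyone who wants to compare such runs without reading the numbers by eye, the `bench:` lines can be parsed and compared in a few lines of Python. This is only a minimal sketch: `parse_bench` and `percent_change` are hypothetical helper names introduced here, and the log text is pasted in from the Termux and NDK r22b ffmpeg-kit results above.

```python
import re

# Matches the numeric fields on ffmpeg "bench:" lines,
# e.g. "utime=224.977s" or "maxrss=687092kB" (the unit suffix is ignored).
_BENCH_FIELD = re.compile(r"(utime|stime|rtime|maxrss)=([0-9.]+)")


def parse_bench(log_text):
    """Collect the numeric fields from every 'bench:' line in an ffmpeg log."""
    fields = {}
    for line in log_text.splitlines():
        if line.strip().startswith("bench:"):
            for key, value in _BENCH_FIELD.findall(line):
                fields[key] = float(value)
    return fields


def percent_change(baseline, candidate):
    """Relative change of candidate versus baseline, in percent."""
    return (candidate - baseline) / baseline * 100.0


# Log excerpts copied from the results above (hypothetical variable names).
termux = parse_bench(
    "bench: utime=224.977s stime=4.960s rtime=32.341s\n"
    "bench: maxrss=687092kB\n"
)
ffmpeg_kit = parse_bench(
    "bench: utime=241.973s stime=5.484s rtime=33.303s\n"
    "bench: maxrss=776324kB\n"
)

for key in ("utime", "stime", "rtime", "maxrss"):
    print(f"{key}: {percent_change(termux[key], ffmpeg_kit[key]):+.1f}%")
```

With these figures the wall-clock time differs by about 3% while peak memory differs by about 13%, which matches the observation that memory usage is the main difference in this particular test.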
In recent days, I've been reflecting on and investigating this issue extensively, conducting numerous tests in an attempt to identify the cause behind these test results.
Upon reviewing the tests I've conducted, I realized that all the video samples used in my tests were recorded using my phone. This is because I'm trying to develop an Android application to compress videos shot on my phone.
When testing with my own recorded videos, visible performance differences were evident regardless of the encoder used. However, after receiving your response, I attempted testing with video samples downloaded from the internet and obtained results similar to yours - no noticeable performance differences during transcoding.
This prompted me to consider that the issue might lie in decoding performance. Videos from the internet are typically compressed and easier to decode, whereas videos recorded on my phone usually have higher resolution and bitrate, making decoding performance the true bottleneck.
Therefore, I selected some videos I recorded and others downloaded from the internet, and tested decoding performance using the `ffmpeg -hide_banner -benchmark -an -i <input.mp4> -f null -` command. These tests were conducted on my Android phone (Snapdragon 8 Gen 2, arm64-v8a, API 34, 12GB RAM).
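On the Termux side, a decode-only run like this could be scripted so that the same arguments are used for every file. The following is a rough sketch, not what was actually used in these tests: `decode_benchmark_args` and `run_decode_benchmark` are hypothetical helpers, and it assumes an `ffmpeg` binary on the `PATH` (as in Termux). Inside an app, the same argument string would instead be passed to FFmpeg-Kit.

```python
import shlex
import subprocess


def decode_benchmark_args(input_path, ffmpeg="ffmpeg"):
    """Build the argument list for a decode-only benchmark run."""
    return [
        ffmpeg,
        "-hide_banner",
        "-benchmark",      # print utime/stime/rtime and maxrss at the end
        "-an",             # ignore audio so only video decoding is measured
        "-i", input_path,
        "-f", "null", "-",  # decode every frame, discard the output
    ]


def run_decode_benchmark(input_path):
    """Run ffmpeg and return the 'bench:' lines from its log output (stderr)."""
    result = subprocess.run(
        decode_benchmark_args(input_path),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.PIPE,
        text=True,
        errors="replace",
        check=True,
    )
    return [line for line in result.stderr.splitlines()
            if line.strip().startswith("bench:")]


# Show the command that would be executed; nothing is run here.
print(shlex.join(decode_benchmark_args("input.mp4")))
```

Calling `run_decode_benchmark("<input.mp4>")` on a device would then return the two `bench:` lines for that file.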
List of files used for testing:
file name | size | codec | bit_rate | resolution | pix_fmt | color_space | source |
---|---|---|---|---|---|---|---|
VID_20240411_190238_.mp4 | 42.3MB | hevc | 15724862 | 1920*1080 | yuv420p | bt709 | shot by me, uploaded to Google Drive |
VID_20240410_174528_HDR10PLUS_.mp4 | 139.4MB | hevc | 38935277 | 3840*2160 | yuv420p10le | bt2020nc | shot by me, uploaded to Google Drive |
file_example_MP4_1920_18MG.mp4 | 17.0MB | h264 | 4486713 | 1920*1080 | yuv420p | bt709 | downloaded from file-examples.com |
1918465-uhd_3840_2160_24fps.mp4 | 46.0MB | h264 | 25236664 | 3840*2160 | yuv420p | bt709 | downloaded from pexels.com |
Test result:
file name | label | utime(s) | stime(s) | rtime(s) | maxrss(kB) |
---|---|---|---|---|---|
VID_20240411_190238_.mp4 | Termux | 15.648 | 0.498 | 8.283 | 152632 |
VID_20240411_190238_.mp4 | FFmpeg-Kit | 32.697 | 0.538 | 17.049 | 267496 |
VID_20240411_190238_.mp4 | compare | +109% | +8% | +106% | +75% |
VID_20240410_174528_HDR10PLUS_.mp4 | Termux | 83.827 | 0.977 | 56.333 | 565168 |
VID_20240410_174528_HDR10PLUS_.mp4 | FFmpeg-Kit | 148.439 | 2.138 | 101.33 | 644808 |
VID_20240410_174528_HDR10PLUS_.mp4 | compare | +77% | +119% | +80% | +14% |
file_example_MP4_1920_18MG.mp4 | Termux | 6.215 | 0.544 | 1.243 | 146376 |
file_example_MP4_1920_18MG.mp4 | FFmpeg-Kit | 7.329 | 0.447 | 1.441 | 257008 |
file_example_MP4_1920_18MG.mp4 | compare | +18% | -18% | +16% | +76% |
1918465-uhd_3840_2160_24fps.mp4 | Termux | 16.796 | 0.687 | 2.891 | 360164 |
1918465-uhd_3840_2160_24fps.mp4 | FFmpeg-Kit | 20.59 | 0.658 | 3.565 | 463252 |
1918465-uhd_3840_2160_24fps.mp4 | compare | +23% | -4% | +23% | +29% |
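The `compare` rows appear to be the relative change of the FFmpeg-Kit value with respect to the Termux value. This is an inference from the numbers rather than something stated explicitly, but it reproduces the table:

$$
\text{compare} = \frac{x_{\text{FFmpeg-Kit}} - x_{\text{Termux}}}{x_{\text{Termux}}} \times 100\%
$$

For example, for the utime of VID_20240411_190238_.mp4: $(32.697 - 15.648) / 15.648 \approx +109\%$.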
Let's review the log output. On the left side of the screenshot is the output from `Termux`, and on the right side is the output from `FFmpeg-Kit`.
Here I'm taking `VID_20240411_190238_.mp4` as an example, and the logs for the other files are similar. Statistics logs are omitted.
It appears that they both use the decoder called `native`, but there is a significant performance difference.
It seems like this is the real issue at hand.
Additionally, I also tested not using the `native` decoder, but using the Android hardware accelerated decoder.
Command line used for testing:
ffmpeg -hide_banner -an -benchmark -hwaccel mediacodec -i <input.mp4> -f null -
Log output in `Termux`:
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/storage/emulated/0/FFmpegTest/VID_20240420_182825_8K_.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2mp41
encoder : Lavf60.16.100
Duration: 00:01:08.07, start: 0.000000, bitrate: 105099 kb/s
Stream #0:0[0x1](und): Video: hevc (Main) (hvc1 / 0x31637668), yuv420p(tv, bt709), 7680x4320, 105097 kb/s, 23.99 fps, 24 tbr, 90k tbn (default)
Metadata:
handler_name : VideoHandler
vendor_id : [0][0][0][0]
[hevc_mediacodec @ 0xb4000073052a0c00] Both surface and native_window are NULL
[hevc_mediacodec @ 0xb4000073052a0c00] Using surface 0x0
[hevc_mediacodec @ 0xb4000073052a0c00] No Java virtual machine has been registered
[hevc_mediacodec @ 0xb4000073052a0c00] Failed to getCodecNameByType
[hevc_mediacodec @ 0xb4000073052a0c00] Output crop parameters top=0 bottom=4319 left=0 right=7679, resulting dimensions width=7680 height=4320
[hevc_mediacodec @ 0xb4000073052a0c00] MediaCodec started successfully: codec = c2.qti.hevc.decoder, ret = 0
Stream mapping:
Stream #0:0 -> #0:0 (hevc (hevc_mediacodec) -> wrapped_avframe (native))
Press [q] to stop, [?] for help
[hevc_mediacodec @ 0xb4000073052a0c00] Output MediaFormat changed to android._color-format: int32(2141391876), android._video-scaling: int32(1), android._dataspace: int32(260), color-standard: int32(1), color-range: int32(2), color-transfer: int32(3), sar-height: int32(1), rotation-degrees: int32(0), hdr-static-info: data, sar-width: int32(1), crop: Rect(0, 0, 7679, 4319), width: int32(7680), feature-secure-playback: int32(0), frame-rate: int32(30), hdr10-plus-info: data, height: int32(4320), max-height: int32(4320), max-width: int32(8192), mime: string(video/raw), priority: int32(1), color-format: int32(21), image-data: data, stride: int32(7680), slice-height: int32(4320)}
[hevc_mediacodec @ 0xb4000073052a0c00] Output crop parameters top=0 bottom=4319 left=0 right=7679, resulting dimensions width=7680 height=4320
[hevc_mediacodec @ 0xb4000073052a0c00] Output MediaFormat changed to android._color-format: int32(2141391876), android._video-scaling: int32(1), android._dataspace: int32(260), color-standard: int32(1), color-range: int32(2), color-transfer: int32(3), sar-height: int32(1), rotation-degrees: int32(0), hdr-static-info: data, sar-width: int32(1), crop: Rect(0, 0, 7679, 4319), width: int32(7680), feature-secure-playback: int32(0), frame-rate: int32(30), hdr10-plus-info: data, height: int32(4320), max-height: int32(4320), max-width: int32(8192), mime: string(video/raw), priority: int32(1), color-format: int32(21), image-data: data, stride: int32(7680), slice-height: int32(4320)}
[hevc_mediacodec @ 0xb4000073052a0c00] Output crop parameters top=0 bottom=4319 left=0 right=7679, resulting dimensions width=7680 height=4320
Output #0, null, to 'pipe:':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2mp41
encoder : Lavf60.16.100
Stream #0:0(und): Video: wrapped_avframe, nv12(tv, bt709/bt709/smpte170m, progressive), 7680x4320, q=2-31, 200 kb/s, 24 fps, 24 tbn (default)
Metadata:
handler_name : VideoHandler
vendor_id : [0][0][0][0]
encoder : Lavc60.31.102 wrapped_avframe
[out#0/null @ 0xb40000730521cfc0] video:765kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
frame= 1633 fps= 54 q=-0.0 Lsize=N/A time=00:01:08.04 bitrate=N/A speed=2.26x
bench: utime=20.202s stime=7.498s rtime=30.168s
bench: maxrss=505652kB
Log output in `FFmpegKit`:
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/storage/emulated/0/FFmpegTest/VID_20240420_182825_8K_.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2mp41
encoder : Lavf60.16.100
Duration: 00:01:08.07, start: 0.000000, bitrate: 105099 kb/s
Stream #0:0[0x1](und): Video: hevc (hvc1 / 0x31637668), yuv420p(tv, bt709), 7680x4320, 105097 kb/s, 23.99 fps, 24 tbr, 90k tbn (default)
Metadata:
handler_name : VideoHandler
vendor_id : [0][0][0][0]
[hevc_mediacodec @ 0x70158de800] Both surface and native_window are NULL
[hevc_mediacodec @ 0x70158de800] Using surface 0x0
[hevc_mediacodec @ 0x70158de800] Output crop parameters top=0 bottom=4319 left=0 right=7679, resulting dimensions width=7680 height=4320
[hevc_mediacodec @ 0x70158de800] MediaCodec started successfully: codec = c2.qti.hevc.decoder, ret = 0
Stream mapping:
Stream #0:0 -> #0:0 (hevc (hevc_mediacodec) -> wrapped_avframe (native))
Press [q] to stop, [?] for help
[hevc_mediacodec @ 0x70158de800] Output MediaFormat changed to {crop-right=7679, max-height=4320, sar-width=1, color-format=21, slice-height=4320, image-data=java.nio.HeapByteBuffer[pos=0 lim=104 cap=104], mime=video/raw, hdr-static-info=java.nio.HeapByteBuffer[pos=0 lim=25 cap=25], priority=1, stride=7680, color-standard=1, feature-secure-playback=0, color-transfer=3, sar-height=1, hdr10-plus-info=java.nio.HeapByteBuffer[pos=0 lim=0 cap=0], crop-bottom=4319, max-width=8192, crop-left=0, width=7680, color-range=2, crop-top=0, rotation-degrees=0, frame-rate=30, height=4320}
[hevc_mediacodec @ 0x70158de800] Output crop parameters top=0 bottom=4319 left=0 right=7679, resulting dimensions width=7680 height=4320
Output #0, null, to 'pipe:':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2mp41
encoder : Lavf60.3.100
Stream #0:0(und): Video: wrapped_avframe, nv12(tv, bt709/bt709/smpte170m, progressive), 7680x4320, q=2-31, 200 kb/s, 24 fps, 24 tbn (default)
Metadata:
handler_name : VideoHandler
vendor_id : [0][0][0][0]
encoder : Lavc60.3.100 wrapped_avframe
frame= 1633 fps= 54 q=-0.0 Lsize=N/A time=00:01:08.04 bitrate=N/A speed=2.25x
video:765kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
bench: utime=22.278s stime=6.367s rtime=30.356s
bench: maxrss=620108kB
As you can see, with the Android hardware accelerated decoder `hevc_mediacodec`, the decoding speed of Termux and FFmpegKit is nearly identical (rtime 30.168 s vs 30.356 s), although FFmpegKit still reports a higher maxrss (620108 kB vs 505652 kB).
Thanks for running those tests. I need some time to review them.
I ran your test scenarios on my end. The results from the native decoder are consistent with the figures in your tests. However, in my case, the `MediaCodec` decoder in `ffmpeg-kit` was also 40% slower.
I've noticed the following differences between the `termux` builds and `ffmpeg-kit` builds. I believe these differences contribute to the performance gap between the two.
- The `termux` binaries are compiled using a custom `Android NDK` toolchain, while `ffmpeg-kit` utilizes the default `LLVM` toolchain provided with the `Android NDK`
- The `termux` toolchain implements & enables certain native libraries that are not included in the `Android NDK`
- Several external libraries in `termux` are compiled with `ASM`, which unfortunately wasn't possible for the same libraries in `ffmpeg-kit`
- `FFmpeg` is compiled with different configuration options
I managed to enable `ASM` for `x265` on `64bit` Android architectures in the `development` branch. This will speed up `x265` operations.
There is also a new `--toolchain` option defined for `android.sh` to override the default `NDK llvm` toolchain. That can be used to build `ffmpeg-kit` with custom toolchains, e.g. the termux toolchain.
I'm also affected by this issue, namely poor decoding performance for HEVC videos recorded on a smartphone.
Is there a confirmed working solution or workaround for this issue?
Are you talking about compression? If so, have you tried the `ultrafast` preset?
I'm talking about the time it takes for transcoding, as mentioned above by @tasy5kg.
> I managed to enable `ASM` for `x265` on `64bit` Android architectures in the `development` branch. This will speed up `x265` operations.
> There is also a new `--toolchain` option defined for `android.sh` to override the default `NDK llvm` toolchain. That can be used to build `ffmpeg-kit` with custom toolchains, e.g. the termux toolchain.
Hi! Is there a chance this will be merged into the main branch and included in a main release, so that it is available by default?
How are we going to solve this issue?