FileConverter
Implement hardware acceleration
This PR addresses Issue #392.
I've added Nvidia CUDA FFmpeg arguments, for the mp4 format only for now, as a starting point. What remains is:
- [ ] Supporting AMD acceleration
- [x] Supporting NVIDIA acceleration
- [ ] Implementing the arguments in other video formats as well (right now only for mp4)
- [x] Use localization for the option strings
- [X] Adding a setting in the program interface to control what type of acceleration to use (off by default)
- [X] Maybe: hardware acceleration for the entire transcoding process. Right now only encoding and decoding are accelerated.
- This isn't an easy feat. For CUDA it requires changing filter names from scale to scale_cuda, or crf into qp, for example. I tried doing this but couldn't get it to run without errors. It requires much experimentation.
All help is greatly appreciated 😀 This is the first C# program I've edited...
The results I got adding these arguments (which accelerate only the encode/decode part of the process) are 2-3x faster than before, transcoding a 176 MB video into ~8MB using the To Mp4 (low quality) preset.
Development
I couldn't find much information on how to contribute to this program, but this is what I've learned so far:
Installation
The Magick.Native-Q16-x64.dll file must be copied to bin/x64/Debug for the program to compile. To obtain this file you need to follow the Magick.NET compilation guide and then grab it from C:\Users\xxxx\.nuget\packages\magick.native.
Testing
The project can be built in its entirety, and then the installer can be run, but this requires a system restart. A more efficient option is calling FileConverter.exe directly. After building the FileConverter solution, you should be able to find Application/FileConverter/bin/x64/Debug/FileConverter.exe
Opening this file will give you the tutorial window. But if you run it from the command line like so:
.\FileConverter.exe --verbose --conversion-preset "To Mp4 (low quality)" "C:\Users\DevAccount\Desktop\cs.mp4"
It is equivalent to right clicking a preset in the context menu, without all the extra steps.
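For repeated test runs, the same invocation can be scripted. This is a minimal Python sketch, assuming it is launched from the bin/x64/Debug folder; the helper name build_test_command is hypothetical and not part of FileConverter. It mainly shows that the preset name has to reach the program as a single argument.

```python
import subprocess

def build_test_command(preset, input_path, exe=r".\FileConverter.exe"):
    """Return the argument list for a direct FileConverter test run.

    Hypothetical helper: it only uses the flags shown in the command above.
    """
    return [exe, "--verbose", "--conversion-preset", preset, input_path]

args = build_test_command("To Mp4 (low quality)", r"C:\Users\DevAccount\Desktop\cs.mp4")

# list2cmdline shows how the list would be quoted on a Windows command line.
print(subprocess.list2cmdline(args))

# To actually start the conversion (Windows only):
# subprocess.run(args, check=True)
```

Passing the arguments as a list avoids quoting mistakes when the preset name or the file path contains spaces.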
Resources:
- https://docs.nvidia.com/video-technologies/video-codec-sdk/12.0/ffmpeg-with-nvidia-gpu/index.html
- https://trac.ffmpeg.org/wiki/HWAccelIntro
- https://github.com/HeiSir2014/ffmpeg-wiki
- https://stackoverflow.com/a/55747785
- https://lists.ffmpeg.org/pipermail/ffmpeg-user/2017-July/036820.html
- various StackExchange answers you might find
Managed to get full transcoding working. It doesn't provide as much of a speed up as accelerated encoding and decoding, but for long videos it'll definitely be super useful.
Benchmarks
Note: ffmpeg.exe here refers to FileConverter\Application\FileConverter\bin\x64\Debug\ffmpeg.exe; I'm not using the system ffmpeg.
Commands
Instead of modifying the program and testing every single time, I modified the effective ffmpeg command used, and implemented my modifications after getting everything working.
The following commands are the ones executed for the To Mp4 (low quality) preset (except that I changed -n to -y and added -benchmark). To see the execution time, look at rtime=1.234s in the ffmpeg benchmark output.
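Reading the rtime value by hand gets tedious over many runs. A small Python sketch for extracting it from captured ffmpeg output (ffmpeg writes its log to stderr) could look like this. The function name parse_rtime and the sample line are illustrative assumptions, not part of this PR.

```python
import re

# Matches the "rtime=1.234s" field of ffmpeg's -benchmark output.
RTIME_PATTERN = re.compile(r"rtime=(\d+(?:\.\d+)?)s")

def parse_rtime(ffmpeg_output):
    """Return the last rtime value (in seconds) found in the text, or None."""
    matches = RTIME_PATTERN.findall(ffmpeg_output)
    return float(matches[-1]) if matches else None

# Illustrative line only; the fields around rtime are an assumption.
sample = "bench: utime=0.000s stime=0.000s rtime=1.234s"
print(parse_rtime(sample))
```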
Acceleration off
This is used by FileConverter by default.
ffmpeg.exe -y -stats -i "input.mp4" -c:v libx264 -preset medium -crf 31 -c:a aac -qscale:a 0.75 -vf "scale=trunc(iw*1/2)*2:trunc(ih*1/2)*2,format=yuv420p" "output.mp4" -benchmark
HW accelerated encoding and decoding, CPU scaling
ffmpeg.exe -y -stats -hwaccel cuda -i "input.mp4" -c:v h264_nvenc -preset medium -crf 31 -c:a aac -qscale:a 0.75 -vf "scale=trunc(iw*1/2)*2:trunc(ih*1/2)*2,format=yuv420p" "output.mp4" -benchmark
Fully HW accelerated transcoding
ffmpeg.exe -y -stats -hwaccel cuda -hwaccel_output_format cuda -i "input.mp4" -c:v h264_nvenc -preset medium -crf 31 -c:a aac -qscale:a 0.75 -vf "scale_cuda=trunc(iw*1/2)*2:trunc(ih*1/2)*2:format=yuv420p" "output.mp4" -benchmark
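The three commands differ only in the input-side hwaccel flags, the video codec, and the filter string. The Python sketch below makes that difference explicit by rebuilding the same argument lists. The function name and the mode labels ("off", "encode_decode", "full") are hypothetical; this is not how FileConverter builds its arguments internally.

```python
def build_ffmpeg_args(mode, input_path="input.mp4", output_path="output.mp4"):
    """Build the ffmpeg argument list for one of the three benchmarked modes."""
    scale = "trunc(iw*1/2)*2:trunc(ih*1/2)*2"
    if mode == "off":
        # CPU decode, CPU scaling, libx264 encode.
        hw_input, codec = [], "libx264"
        video_filter = f"scale={scale},format=yuv420p"
    elif mode == "encode_decode":
        # CUDA decode and NVENC encode, but scaling still runs on the CPU.
        hw_input, codec = ["-hwaccel", "cuda"], "h264_nvenc"
        video_filter = f"scale={scale},format=yuv420p"
    elif mode == "full":
        # Frames stay on the GPU, so the CUDA scaling filter is used instead.
        hw_input = ["-hwaccel", "cuda", "-hwaccel_output_format", "cuda"]
        codec = "h264_nvenc"
        video_filter = f"scale_cuda={scale}:format=yuv420p"
    else:
        raise ValueError(f"unknown mode: {mode}")

    return ["ffmpeg.exe", "-y", "-stats", *hw_input, "-i", input_path,
            "-c:v", codec, "-preset", "medium", "-crf", "31",
            "-c:a", "aac", "-qscale:a", "0.75",
            "-vf", video_filter, output_path, "-benchmark"]

for mode in ("off", "encode_decode", "full"):
    print(mode, build_ffmpeg_args(mode))
```

Note that the hwaccel options must come before -i, because they apply to the input being decoded.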
Results
input.mp4 is a 30-second, 1920x1080 (1080p), 176 MB file.
To Mp4 (low quality) (1x scaling)
- HW accel off: 14.6s
- HW accelerated encode/decode: 5.7s (2.56x faster than base)
- Fully accelerated transcode: 5.3s (2.75x faster than base)
To Mp4 (lowER quality) (0.5x scaling)
This preset I made changes the scaling from 100% to 50%.
- HW accel off: 6.2s
- HW accelerated encode/decode: 4.3s (1.44x faster than base)
- Fully accelerated transcode: 3.3s (1.88x faster than base)
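The "x faster than base" figures are the unaccelerated time divided by the accelerated time. A short Python sketch to recompute them from the timings listed above (the dictionary keys are just labels chosen for this example):

```python
# Measured times in seconds, copied from the results above.
results = {
    "To Mp4 (low quality)": {"off": 14.6, "encode_decode": 5.7, "full": 5.3},
    "To Mp4 (lowER quality)": {"off": 6.2, "encode_decode": 4.3, "full": 3.3},
}

def speedup(base_seconds, accelerated_seconds):
    """Return how many times faster the accelerated run is than the base run."""
    return base_seconds / accelerated_seconds

for preset, times in results.items():
    for mode in ("encode_decode", "full"):
        ratio = speedup(times["off"], times[mode])
        print(f"{preset} / {mode}: {ratio:.2f}x faster than base")
```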
I compiled @tacheometry's version and ran a few tests: 3 runs with Hardware acceleration mode = Nvidia (CUDA) and 3 runs with Hardware acceleration mode = Off. Each time, the CUDA results were at least 2 times faster than with CPU processing.
@Tichau Can I get a review on this?
I think the easiest way is to replace the ffmpeg file in file-converter with a self-compiled version that has all GPU video processing features enabled, so it will work on any machine whether it has an NVIDIA, Intel, or AMD graphics card.
The Magick.NET compilation guide doesn't exist anymore ¯_(ツ)_/¯