
Performance difference between tiled vs. non-tiled encoding

Open shashi-banger opened this issue 2 years ago • 5 comments

I am using ffmpeg with libkvazaar.

Input Video: https://test-videos.co.uk/vids/bigbuckbunny/mp4/h264/1080/Big_Buck_Bunny_1080_10s_30MB.mp4

Command without tile: preset=ultrafast

ffmpeg -i ~/sb_media/Big_Buck_Bunny_1080_10s_30MB.mp4 -vcodec libkvazaar -kvazaar-params preset=ultrafast,slice=tiles,mv-constraint=frametile -vb 3000000 ~/sb_media/bbb.mov

The above command runs at 50 fps on an i7-9750H (9th-gen Intel).

Command with tile: preset=ultrafast and non-uniform tiling

ffmpeg -i ~/sb_media/Big_Buck_Bunny_1080_10s_30MB.mp4 -vcodec libkvazaar -kvazaar-params tiles-width-split="128,1792",tiles-height-split="832,960",preset=ultrafast,slice=tiles,mv-constraint=frametile -vb 3000000 ~/sb_media/bbb.mov

The above command runs at 8 fps on the same i7-9750H (9th-gen Intel).

The difference in performance between the above two commands seems large. Am I missing something here? Is there a specific reason for this performance difference?

Also, the --threads option doesn't seem to have any impact on performance. I was expecting it to make the encoder use multiple cores to push performance, but it didn't seem to take effect.

shashi-banger • Sep 07 '21 05:09

Because the HEVC standard prohibits WPP with tiles, the level of parallelism is limited to the tiles. Since you create tiles of widely different sizes (image below), the red tiles will be encoded much faster than the dark green tile, i.e., the encoder is forced to work in a single-threaded fashion for most of the encoding.

If you want to improve the parallelism, you have to split the video into more equally sized tiles, or use all-intra coding, though this will increase the needed bitrate by a factor of 4-8×.

[image]
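The imbalance described above can be put into rough numbers. The following is a minimal sketch, not anything Kvazaar itself computes. It assumes the frame is 1920x1080, that the values passed to tiles-width-split and tiles-height-split are pixel positions of tile boundaries, and that encoding time scales with tile area. The function names are hypothetical.

```python
# Sketch: estimate how unevenly a frame is divided by explicit tile splits.
# Assumptions (not stated in the thread): the frame is 1920x1080, the split
# values are pixel positions of tile boundaries, and the time needed to
# encode a tile is proportional to its area.

def tile_sizes(length, splits):
    """Return the size of each segment when `length` is cut at `splits`."""
    edges = [0, *splits, length]
    return [b - a for a, b in zip(edges, edges[1:])]

def tile_areas(width, height, width_splits, height_splits):
    """Return the pixel area of every tile, row by row."""
    cols = tile_sizes(width, width_splits)
    rows = tile_sizes(height, height_splits)
    return [w * h for h in rows for w in cols]

def max_tile_speedup(areas):
    """Upper bound on speedup when each tile of a frame gets its own thread."""
    return sum(areas) / max(areas)

# Split values taken from the tiled command in the question.
areas = tile_areas(1920, 1080, [128, 1792], [832, 960])
largest_share = max(areas) / sum(areas)
print(f"tiles: {len(areas)}, largest tile share: {largest_share:.2f}")
print(f"speedup bound from tiles alone: {max_tile_speedup(areas):.2f}x")
```

Under these assumptions the largest of the nine tiles covers about two thirds of the frame, so tile-level parallelism within a single frame cannot give much more than a 1.5x speedup, no matter how many threads are available.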

Jovasa • Sep 07 '21 06:09

Yes, as @Jovasa said, the workload is uneven without the possibility to use WPP while using tiles. You can try to increase --owf=N, where N is the number of concurrent frames Kvazaar is processing. There are also limitations, but basically if you have an intra period of 64 and you set owf to 64, it can at least process the two intra frames in parallel ;)

fador • Sep 07 '21 06:09

Thank you. Using owf does indeed help with parallelism. Here is a modified command with which I got 20 fps.

ffmpeg -i ~/sb_media/Big_Buck_Bunny_1080_10s_30MB.mp4 -vcodec libkvazaar -kvazaar-params tiles-width-split="128,1792",tiles-height-split="832,960",preset=ultrafast,slice=tiles,mv-constraint=frametile,owf=150,threads=4 -g 15 -keyint_min 15 -vb 3000000 ~/sb_media/bbb.mov

But the owf option seems to be introducing artifacts. Here is an example command for which some obvious artifacts were observed:

ffmpeg -i ~/sb_media/Big_Buck_Bunny_1080_10s_30MB.mp4 -vcodec libkvazaar -kvazaar-params tiles-height-split="896",preset=ultrafast,slice=tiles,mv-constraint=frametile,owf=15,threads=4 -g 15 -keyint_min 15 -vb 3000000 ~/sb_media/bbb.mov

shashi-banger • Sep 07 '21 08:09

Unfortunately, owf does not always play nicely with rate control. If you want the absolute best rate control performance, you are limited to owf=0. However, when a random access GOP is used, setting owf to a multiple of the GOP length, i.e., 8 or 16, should provide the best result, whereas a multiple of the GOP length minus 1, i.e., 15 in your second case, is the most likely to produce artifacts.
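The rule of thumb above can be written down as a small check. This is only a sketch of the advice in this comment, not a description of what the encoder does internally. The function name and labels are hypothetical, and reading "8 or 16" as the GOP length is an assumption.

```python
# Sketch: classify an owf value against the rule of thumb from this comment.
# Assumption: `gop_length` is the length of the random access GOP (8 or 16);
# the labels below are illustrative and are not Kvazaar terminology.

def classify_owf(owf, gop_length):
    """Return a rough label for how owf is expected to interact with rate control."""
    if owf == 0:
        return "serial"    # no frame-level overlap, best rate control
    if owf % gop_length == 0:
        return "aligned"   # multiple of the GOP length, expected to work best
    if (owf + 1) % gop_length == 0:
        return "risky"     # multiple of the GOP length minus 1
    return "other"         # the rule of thumb says nothing specific

# owf values that appear in this thread, checked against an assumed GOP length of 8.
for owf in (15, 120, 150):
    print(owf, classify_owf(owf, gop_length=8))
```

With an assumed GOP length of 8, owf=15 falls into the "risky" group and owf=120 into the "aligned" group, while owf=150 matches neither pattern.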

There is a second rate control algorithm that is less dependent on the owf value; unfortunately, it does not currently work correctly with tiles.

Jovasa • Sep 07 '21 08:09

@Jovasa thank you for your response. As you suggested, the following command does not result in "obvious" artifacts.

ffmpeg -i ~/sb_media/Big_Buck_Bunny_1080_10s_30MB.mp4 -vcodec libkvazaar -kvazaar-params tiles-height-split="896",preset=ultrafast,slice=tiles,mv-constraint=frametilemargin,owf=120,threads=4 -g 15 -keyint_min 15 -vb 3000000 ~/sb_media/bbb.mov

shashi-banger • Sep 07 '21 11:09