rav1e
rav1e copied to clipboard
Disabling CDEF causes a > 80% regression on metrics
Using this patch:
diff --git a/src/encoder.rs b/src/encoder.rs
index 8734473..4f3f130 100644
--- a/src/encoder.rs
+++ b/src/encoder.rs
@@ -213,7 +213,7 @@ impl Sequence {
enable_ref_frame_mvs: false,
enable_warped_motion: false,
enable_superres: false,
- enable_cdef: config.speed_settings.cdef && config.chroma_sampling != ChromaSampling::Cs422,
+ enable_cdef: false,
enable_restoration: config.chroma_sampling != ChromaSampling::Cs422 &&
config.chroma_sampling != ChromaSampling::Cs444, // FIXME: not working yet
operating_points_cnt_minus_1: 0,
leads to this mess:
https://beta.arewecompressedyet.com/?job=master-f9ad26f55ebd5e3a827213b18f5b2534550ead39&job=master-no-cdef%402019-06-28T13%3A12%3A08.894Z
This probably points to a bug in our CDEF disabled codepath.
Did you run cargo test --features=decode_test?
running 85 tests
test api::test::flush_unlimited_reorder_scene_change_detection ... ok
test cdef::test::check_max_element ... ok
test cdef::test::test_padded_frame_copy ... ok
test cdef::test::test_padded_frame_copy_outside_input ... ok
test context::test::cdf_map ... ok
test context::test::cfl_joint_sign ... ok
test ec::test::booleans ... ok
test ec::test::cdf ... ok
test ec::test::mixed ... ok
test ec::test::update_cdf_4_sse2 ... ok
test encoder::test::check_partition_types_order ... ok
test frame::plane::test::copy_from_raw_u8 ... ok
test frame::plane::test::test_plane_pad ... ok
test api::test::flush_low_latency_scene_change_detection ... ok
test me::test::get_sad_same_u16 ... ok
test predict::test::pred_matches_u8 ... ok
test predict::test::pred_max ... ok
test quantize::test::gen_divu_table ... ok
test me::test::get_sad_same_u8 ... ok
test quantize::test::test_tx_log_scale ... ok
test rate::test::bexp64_vectors ... ok
test api::test::flush_unlimited_low_latency_no_scene_change ... ok
test rate::test::blog64_vectors ... ok
test rdo::estimate_rate_test ... ok
test quantize::test::test_divu_pair ... ok
test api::test::flush_unlimited_reorder_no_scene_change ... ok
test api::test::flush_reorder_scene_change_detection ... ok
test api::test::flush_low_latency_no_scene_change ... ok
test api::test::flush_unlimited_low_latency_scene_change_detection ... ok
test api::test::flush_reorder_no_scene_change ... ok
test rate::test::blog64_bexp64_round_trip ... ok
test test_encode_decode::keyframes_aom ... ok
test test_encode_decode::chroma_sampling_422_aom ... ok
test test_encode_decode::high_bit_depth_12_aom ... ok
test test_encode_decode::large_dimension::dimension_512x512_aom ... ok
test test_encode_decode::odd_size_frame_with_full_rdo_aom ... ignored
test test_encode_decode::bitrate_aom ... ok
test test_encode_decode::chroma_sampling_444_aom ... ok
test test_encode_decode::chroma_sampling_420_aom ... ok
test test_encode_decode::high_bit_depth_10_aom ... test test_encode_decode::high_bit_depth_10_aom has been running for over 60 seconds
test test_encode_decode::large_dimension::dimension_1024x1024_aom ... test test_encode_decode::large_dimension::dimension_1024x1024_aom has been running for over 60 seconds
test test_encode_decode::high_bit_depth_10_aom ... ok
test test_encode_decode::quantizer_100_aom ... ok
test test_encode_decode::large_dimension::dimension_2048x2048_aom ... test test_encode_decode::large_dimension::dimension_2048x2048_aom has been running for over 60 seconds
test test_encode_decode::reordering_short_video_aom ... ok
test test_encode_decode::quantizer_120_aom ... ok
test test_encode_decode::quantizer_60_aom ... ok
test test_encode_decode::quantizer_80_aom ... ok
test test_encode_decode::small_dimension::dimension_256x256_aom ... ok
test test_encode_decode::small_dimension::dimension_258x258_aom ... ok
test test_encode_decode::small_dimension::dimension_260x260_aom ... ok
test test_encode_decode::speed_0_aom ... ignored
test test_encode_decode::speed_10_aom ... ignored
test test_encode_decode::speed_1_aom ... ignored
test test_encode_decode::speed_2_aom ... ignored
test test_encode_decode::speed_3_aom ... ignored
test test_encode_decode::speed_4_aom ... ignored
test test_encode_decode::speed_5_aom ... ignored
test test_encode_decode::speed_6_aom ... ignored
test test_encode_decode::speed_7_aom ... ignored
test test_encode_decode::speed_8_aom ... ignored
test test_encode_decode::speed_9_aom ... ignored
test test_encode_decode::small_dimension::dimension_262x262_aom ... ok
test test_encode_decode::tiny_dimension::dimension_128x128_aom ... ok
test test_encode_decode::tiny_dimension::dimension_16x16_aom ... ok
test test_encode_decode::reordering_aom ... ok
test test_encode_decode::tiny_dimension::dimension_32x32_aom ... ok
test test_encode_decode::tiny_dimension::dimension_8x8_aom ... ok
test tiling::tiler::test::test_tile_area ... ok
test tiling::tiler::test::test_tile_blocks_area ... ok
test tiling::tiler::test::test_tile_blocks_write ... ok
test tiling::tiler::test::test_tile_iter_len ... ok
test tiling::tiler::test::test_tile_motion_vectors_write ... ok
test tiling::tiler::test::test_tile_restoration_edges ... ok
test tiling::tiler::test::test_tile_restoration_write ... ok
test tiling::tiler::test::test_tile_write ... ok
test tiling::tiler::test::test_tiling_info_from_tile_count ... ok
test transform::test::log_tx_ratios ... ok
test transform::test::roundtrips_u16 ... ok
test transform::test::roundtrips_u8 ... ok
test util::sanity ... ok
test test_encode_decode::tiny_dimension::dimension_64x64_aom ... ok
test test_encode_decode::small_dimension::dimension_264x264_aom ... ok
test test_encode_decode::small_dimension::dimension_265x265_aom ... ok
test test_encode_decode::tile_encoding_with_stretched_restoration_units_aom ... ok
test test_encode_decode::low_bit_depth_aom ... ok
test test_encode_decode::large_dimension::dimension_1024x1024_aom ... ok
test test_encode_decode::large_dimension::dimension_2048x2048_aom ... ok
test result: ok. 73 passed; 0 failed; 12 ignored; 0 measured; 0 filtered out
Running target/debug/deps/rav1e-b6a91d22f794d941
running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
Running target/debug/deps/aom-089995dff9b2c60d
running 13 tests
test bindgen_test_layout___fsid_t ... ok
test bindgen_test_layout_aom_codec_ctx ... ok
test bindgen_test_layout_aom_codec_ctx__bindgen_ty_1 ... ok
test bindgen_test_layout_aom_codec_dec_cfg ... ok
test bindgen_test_layout_aom_codec_frame_buffer ... ok
test bindgen_test_layout_aom_codec_stream_info ... ok
test bindgen_test_layout_aom_image_rect ... ok
test bindgen_test_layout_aom_image ... ok
test bindgen_test_layout_aom_inspect_init ... ok
test bindgen_test_layout_aom_postproc_cfg ... ok
test bindgen_test_layout_av1_ref_frame ... ok
test bindgen_test_layout_imaxdiv_t ... ok
test bindgen_test_layout_cfg_options ... ok
test result: ok. 13 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
Doc-tests rav1e
running 1 test
test src/lib.rs - version (line 101) ... ok
test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
No desyncs that I can see with CDEF disabled. The rate is simply much higher for the same quality. The curves make sense at least; high gains at low rates, smaller gains at high rates. I thought that something may be messing with the RD cost calculation in RDO, but disabling cdef_dist
did not change results significantly.
Red kayak does not seem to care about CDEF much (disabling cdef_dist
actually slightly improves its PSNR with CDEF disabled). It seems that massive losses are concentrated in mostly static scenes.
No desyncs that I can see with CDEF disabled. The rate is simply much higher for the same quality. Are the numbers really that odd? The curves make sense at least; high gains at low rates, smaller gains at high rates. I thought that something may be messing with the RD cost calculation in RDO, but disabling
cdef_dist
did not change results significantly.Red kayak does not seem to care about CDEF much (disabling
cdef_dist
actually slightly improves its PSNR with CDEF disabled). It seems that massive losses are concentrated in mostly static scenes.
@rzumer so I tested disabling it at the frame header level, and it seems like there are code paths that break with only that change.
After looking at bit distribution in the analyzer it seems the self-guided filter is running loose with CDEF disabled; every frame allocated over 6000 bits to it at q=252 on Kristen. The normal distribution with CDEF enabled seems to be a few thousand bits on intra frames, and nothing on inter frames.
Comparative run (no LRF + no CDEF => only no CDEF): https://beta.arewecompressedyet.com/?job=master-no-cdef-lrf%402019-07-23T22%3A54%3A26.767Z&job=master-no-cdef%402019-07-23T19%3A16%3A53.957Z
https://github.com/xiph/rav1e/blob/master/src/rdo.rs#L1454 I believe best_cost[pli] will be < 0 if cdef is disabled. This would cause sgr to always be used.
https://github.com/xiph/rav1e/blob/master/src/rdo.rs#L1454 I believe best_cost[pli] will be < 0 if cdef is disabled. This would cause sgr to always be used.
@KyleSiefring yeah, accounting for that may have done it (although this fix may be too naive):
diff --git a/src/rdo.rs b/src/rdo.rs
index 4889b54..be42f93 100644
--- a/src/rdo.rs
+++ b/src/rdo.rs
@@ -1497,7 +1497,7 @@ pub fn rdo_loop_decision<T: Pixel>(tile_sbo: TileSuperBlockOffset, fi: &FrameInv
let rate = cw.count_lrf_switchable(w, &ts.restoration.as_const(), current_lrf, pli);
let frame_bo = ts.to_frame_super_block_offset(tile_sbo).block_offset(0, 0);
let cost = compute_rd_cost(fi, frame_bo, BlockSize::BLOCK_64X64, rate, err);
- if best_cost[pli] < 0. || cost < best_cost[pli] {
+ if (fi.sequence.enable_cdef && best_cost[pli] < 0.) || cost < best_cost[pli] {
best_cost[pli] = cost;
best_lrf[pli] = current_lrf;
lrf_change = true;
AWCY results with the above fix + CDEF disabled:
https://beta.arewecompressedyet.com/?job=master-8b5936e33ea49a9320a13b0ef0103956684340e1&job=cdef-disabled%402019-08-07T04%3A20%3A03.200Z
On checking now, https://beta.arewecompressedyet.com/?job=master-e19269752954379f4aa1033010e84489e85ef73d&job=rav1e-no-cdef%402021-05-23T00%3A26%3A47.352Z
An average of 4-12% is noticed, I think this seems normal