rav1e icon indicating copy to clipboard operation
rav1e copied to clipboard

[WIP] Motion Estimation Overhead

Open KyleSiefring opened this issue 4 years ago • 2 comments

Written as of 01052fc7121114904501f81f52a234c96373829d.

I'm going to start at the tail end of the call stack and work up from there.

get_mv_rate

https://github.com/xiph/rav1e/blob/01052fc7121114904501f81f52a234c96373829d/src/me.rs#L1015-L1031

I think this could use a rewrite. Checking if d == 0 shouldn't be necessary and ilog would work here. Unfortunately, the baseline x86/x86_64 will still have a branch here because bsr is undefined for 0. In a more positive light, the compiler manages to implement the conditional shift without a branch a la diff >> ((allow_high_precision_mv as i16) ^ 1).

compute_mv_rd_cost

https://github.com/xiph/rav1e/blob/01052fc7121114904501f81f52a234c96373829d/src/me.rs#L852-L869

We have a branch for calling sad vs satd. We call get_mv_rate here twice. There are definitely places in the code where both pmvs are always the same. There will be some dependency chains when reducing the results in sad/satd, so this might not be super critical. You also have a multiply at the end which will be fairly expensive.

It might be worth skipping rdo for half res and quarter res search. The code already decreases lambda in those cases. I have done awcy runs testing this and the results are seem favorable. My concern is that our coarse search is primitive.

predict_inter, predict_inter_single, predict_inter_compound

https://github.com/xiph/rav1e/blob/01052fc7121114904501f81f52a234c96373829d/src/predict.rs#L280-L413

predict_inter_single is called from get_mv_rd_cost (motion search) and predict_inter is called in motion_compensate. Every time predict_inter_single is called, you need to index three time are and check an Option. This is silly. Our motion search is already working on a single reference frame at a time and has the plane loaded. Also, it would be very concerning if the if let is not to be taken for most, if not all, the code calling predict_inter and friends. diamond_search (I haven't checked full search) even creates an uninitialized buffer for subpel.

After indexing and handling the Option, we need to get the appropriate PlaneSlice from a Plane. First, we get a frame offset from our tile. We are already using frame offsets when calling this from motion search. We have .slice(qo).clamp().subslice(3, 3) which is a bit weird. Unless this is in the spec, silently changing the input parameters is probably not the greatest. Doing a bounds check may be preferable.

We have our PlaneSlice. Is it over? No. Inside put_8tap (and prep_8tap), we must get the appropriate pointer from the x and y coordinates stored in the PlaneSlice. To do this we must do a compute base_ptr + row * stride + col. Once we are searching subpel, the row and col should only change by one and thus wouldn't need to multiply. For a diamond search, you could perform some form of bit bashing or lookup to alter the row (e.g. (untested) if ((center_row - step) ^ center_row) & 16) != 0 { stride } else { 0 }.).

predict_inter_compound needs to do all this twice, but is called through 'motion_compensate'. That code works with multiple reference frames and multiple planes, but there is still some duplication.

Personally, I feel the option inside these functions is a bad, error prone design and it would be prudent to extract the ref frame separately.

To Be Continued...

TODO: get_mv_rd_cost calls predict inter_single and compute_mv_rd_cost

TODO: diamond_me_search calls get_mv_rd_cost

TODO: full_search calls compute_mv_rd_cost

TODO: motion_compensate calls predict_inter

KyleSiefring avatar Jul 15 '20 01:07 KyleSiefring

Re "We have a branch for calling sad vs satd." that branch is likely being elided by the compiler since we always inline this function. Other than that these seem like reasonable points of improvement.

shssoichiro avatar Jul 15 '20 04:07 shssoichiro

diamond_me_search passes use_satd to get_mv_rd_cost and use_satd is also determined by a runtime option.

KyleSiefring avatar Jul 15 '20 12:07 KyleSiefring