pl_mpeg
Slight performance optimizations
One of the most time-consuming tasks in pl_mpeg is actually reading the buffers, especially because every single read checked whether the buffer still had enough data.
This change creates _unchecked versions of plm_buffer_read and plm_buffer_skip, which, as the name implies, don't check the amount of data still available.
To compensate, plm_buffer_has has been added in many places where the needed amount of available data can be figured out beforehand, so all _unchecked reads should be guaranteed to be safe.
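The pattern can be sketched as follows. This is a minimal, self-contained illustration and not pl_mpeg's actual code: the bitbuf_t type, the bitbuf_* functions, header_t and the 12/12/4-bit field layout are all hypothetical names and values chosen for the example. It shows one availability check up front covering several unchecked reads.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit reader, not pl_mpeg's plm_buffer_t. */
typedef struct {
    const uint8_t *bytes;
    size_t length;     /* number of valid bytes */
    size_t bit_index;  /* next bit to read, counted from the start */
} bitbuf_t;

/* Nonzero if at least `count` bits are still available. */
static int bitbuf_has(const bitbuf_t *b, size_t count) {
    return (b->length * 8 - b->bit_index) >= count;
}

/* Reads `count` bits (1..32, MSB first) with no bounds check.
   The caller must have verified availability beforehand. */
static uint32_t bitbuf_read_unchecked(bitbuf_t *b, int count) {
    uint32_t value = 0;
    while (count > 0) {
        uint8_t current = b->bytes[b->bit_index >> 3];
        int remaining = 8 - (int)(b->bit_index & 7); /* unread bits in this byte */
        int take = remaining < count ? remaining : count;
        int shift = remaining - take;
        uint32_t mask = 0xffu >> (8 - take);
        value = (value << take) | ((uint32_t)(current >> shift) & mask);
        b->bit_index += (size_t)take;
        count -= take;
    }
    return value;
}

/* Checked variant: returns 0 (assumed failure value) if data is missing. */
static uint32_t bitbuf_read(bitbuf_t *b, int count) {
    if (!bitbuf_has(b, (size_t)count)) {
        return 0;
    }
    return bitbuf_read_unchecked(b, count);
}

/* Hypothetical header layout: 12 + 12 + 4 bits. */
typedef struct {
    uint32_t width;
    uint32_t height;
    uint32_t rate_code;
} header_t;

/* One check for the whole header, then three unchecked reads. */
static int parse_header(bitbuf_t *b, header_t *out) {
    if (!bitbuf_has(b, 12 + 12 + 4)) {
        return 0;
    }
    out->width = bitbuf_read_unchecked(b, 12);
    out->height = bitbuf_read_unchecked(b, 12);
    out->rate_code = bitbuf_read_unchecked(b, 4);
    return 1;
}
```

In this sketch the checked version performs one comparison per field, while parse_header performs one comparison for all three fields, which is the kind of saving described above.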
I also added plm_buffer_is_aligned, which checks whether the read position is aligned to a byte boundary; plm_buffer_read_byte, which checks for both enough available buffer data and bit alignment; and plm_buffer_read_byte_unchecked, which reads the byte directly from the buffer without checking the remaining buffer length or bit alignment.
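A rough sketch of how these three helpers relate to each other, again using hypothetical names (bitreader_t and the bitreader_* functions) rather than the real pl_mpeg definitions. How the checked read reports failure is an assumption here: it returns 0 and leaves the output untouched.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reader state, not pl_mpeg's plm_buffer_t. */
typedef struct {
    const uint8_t *bytes;
    size_t length;     /* number of valid bytes */
    size_t bit_index;  /* next bit to read */
} bitreader_t;

/* Nonzero if the read position sits on a byte boundary. */
static int bitreader_is_aligned(const bitreader_t *r) {
    return (r->bit_index & 7) == 0;
}

/* Reads one byte directly: no length check, no alignment check.
   Only valid when the caller knows the position is aligned and
   at least 8 bits remain. */
static uint8_t bitreader_read_byte_unchecked(bitreader_t *r) {
    uint8_t value = r->bytes[r->bit_index >> 3];
    r->bit_index += 8;
    return value;
}

/* Checked read: returns 1 and stores the byte on success, or 0 if
   the position is unaligned or fewer than 8 bits remain. */
static int bitreader_read_byte(bitreader_t *r, uint8_t *out) {
    if (!bitreader_is_aligned(r) || r->bit_index + 8 > r->length * 8) {
        return 0;
    }
    *out = bitreader_read_byte_unchecked(r);
    return 1;
}
```

When the position is known to be aligned, the byte can be fetched with a single array access instead of going through the general bit-by-bit path.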
A very small optimization to plm_video_idct was also added: it avoids a sign flip in the y7 calculation by flipping the remaining signs instead.
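The general idea behind this kind of change can be illustrated with a generic sketch. This is not the actual plm_video_idct code; the functions, the pair_t type and the variable roles below are hypothetical and only show the algebraic identity: a value formed with a leading negation can be kept positive if every later use swaps addition and subtraction.

```c
#include <stdint.h>

typedef struct {
    int32_t first;
    int32_t second;
} pair_t;

/* Before: y7 is formed with an explicit negation. */
static pair_t butterfly_negated(int32_t x0, int32_t t, int32_t y6) {
    int32_t y7 = -x0 - t;
    pair_t out = { y6 - y7, y6 + y7 };
    return out;
}

/* After: y7 is kept positive and the signs at its uses are swapped.
   The outputs are identical, with one negation fewer. */
static pair_t butterfly_flipped(int32_t x0, int32_t t, int32_t y6) {
    int32_t y7 = x0 + t;
    pair_t out = { y6 + y7, y6 - y7 };
    return out;
}
```

Both functions compute the same pair of outputs; the second one simply folds the negation into the operations that follow.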
Some warnings specific to Visual Studio were also removed.
Overall, this yields a 5% to 7% performance improvement in my test cases.
As a note, I tried fiddling with SIMD, especially on plm_video_idct. I did get it to work, but the performance was either worse (using SSE4.1) or only marginally (<1%) better (with AVX2), so I scrapped that idea.