arcode-rs icon indicating copy to clipboard operation
arcode-rs copied to clipboard

How to select appropriate number of bits for precision?

Open danieleades opened this issue 2 years ago • 4 comments

How do you select an appropriate number of bits to use for precision?

Related stack overflow question - https://stackoverflow.com/questions/71509576/what-is-the-optimum-precision-to-use-in-an-arithmetic-encoder

danieleades avatar Mar 17 '22 09:03 danieleades

from my own reading and experimentation, i'm finding that-

  • the precision bits must be at least 2 greater than the number of bits needed to represent that probability of each symbol
  • using more precision bits results in a better compression ratio
    • the gains are very minimal and saturate quickly
  • the total number of bits is constrained by the width of the type used (say u32 or u64)

haven't investigated performance implications.

danieleades avatar Mar 17 '22 13:03 danieleades

Check out the “Constraints on the Implementation” section from the paper referenced in the read me. It discusses choosing values and limitations of precision.

https://web.stanford.edu/class/ee398a/handouts/papers/WittenACM87ArithmCoding.pdf

cgbur avatar Mar 17 '22 13:03 cgbur

Thank you for the reference! Unfortunately that article doesn't add much that's new-

  • the precision bits must be at least 2 greater than the number of bits needed to represent that probability of each symbol
  • more bits is better
  • there are some 'special' numbers which lend themselves to faster computation (ie 16)

still doesn't say anything about the rate of trade-off between precision and other factors. For example, why not use u64 instead of u32? or even u128? In what regimes do the trade-offs matter?

For my own implementation i've chose to set the precision automatically (to the maximum value), and panic if there's not enough bits to guarantee no overflow/underflow. Considering also making the integer representation a generic, so you could effectively choose u32 or u64.

I'm also interested in experimenting to see whether this can be checked at compile time using const generics... (hopefully without polluting the API too badly)

see https://github.com/danieleades/arithmetic-coding/pull/7/files#diff-9a5ef4cd6ac7a66abd842b474ee88447516ed3bd218748eb6f387a0895d14fcb

danieleades avatar Mar 17 '22 13:03 danieleades

Hey sorry I went so absent on this project have not been keeping up. I just want to drop a link to this https://github.com/Cyan4973/FiniteStateEntropy If you are still interested in lossless probability symbol coders. Very very fast and the ANS version is as good as AC but much faster. AFAIK lacks updating symbol table per encode?

cgbur avatar Jan 28 '23 05:01 cgbur