
Added unicode generators

Open P-Andersson opened this issue 9 years ago • 2 comments

No documentation yet, though the implementation is tested and supports UTF-8. I'm looking for some feedback regarding a few issues.

  • Currently, the fundamental part here is the Unicode codepoint, an integer which is identical to its UTF-32 encoding. I'm not sure whether it is a good idea to keep these two concepts separate from each other.
  • The algorithm used to generate the codepoint heavily favors the low end of the possible range of values (the ASCII range), though higher values still come up a significant fraction of the time. The reasoning is that characters that are likely to be treated specially (newlines, spaces, tabs...) occur there. This also gives a better distribution of byte sizes for UTF-8 than a uniform random distribution of codepoint values would.
  • Currently, it only generates valid Unicode values (though these include values that have not been assigned a symbol yet). I'm not sure whether invalid ones are interesting to generate.
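
To make the distribution point concrete, here is a minimal standalone sketch of the idea, not the code in this change. It assumes a simple weighting by UTF-8 byte length; the weights and the names `isValidCodepoint`, `biasedCodepoint` and `encodeUtf8` are hypothetical and are not part of rapidcheck's API. It uses only the standard library rather than rapidcheck's `Gen` machinery.

```cpp
#include <cstdint>
#include <random>
#include <string>

// Hypothetical helper: a value is a valid codepoint here if it is within
// the Unicode range and is not a UTF-16 surrogate (U+D800..U+DFFF).
bool isValidCodepoint(std::uint32_t cp) {
    return cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF);
}

// Hypothetical generator: first picks a UTF-8 length class (1 to 4 bytes)
// with weights that favor the low end, then picks uniformly inside it.
// The weights 4:3:2:1 are an illustrative assumption.
template <typename Rng>
std::uint32_t biasedCodepoint(Rng &rng) {
    static const std::uint32_t lo[] = {0x0, 0x80, 0x800, 0x10000};
    static const std::uint32_t hi[] = {0x7F, 0x7FF, 0xFFFF, 0x10FFFF};
    std::discrete_distribution<int> pickRange{4.0, 3.0, 2.0, 1.0};
    for (;;) {
        const int r = pickRange(rng);
        std::uniform_int_distribution<std::uint32_t> pick(lo[r], hi[r]);
        const std::uint32_t cp = pick(rng);
        if (isValidCodepoint(cp)) {
            return cp; // surrogates are rejected and the draw is retried
        }
    }
}

// Encodes a single valid codepoint as UTF-8 (1 to 4 bytes).
std::string encodeUtf8(std::uint32_t cp) {
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

Calling `encodeUtf8(biasedCodepoint(rng))` with a `std::mt19937 rng` yields one encoded character. With a purely uniform draw over the whole range, roughly 94% of codepoints would need four bytes, which is why some bias toward the low end also spreads the byte lengths more evenly.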

P-Andersson, Mar 16 '16 20:03

Nice! I will review this as soon as I have the time, have been a little busy lately :)

emil-e, Mar 17 '16 20:03

Will continue review later.

emil-e, Mar 19 '16 17:03