feat(p/demo): PCG pseudo-random number generator
Description
Random number generation package using the PCG algorithm
Overview
I used the PCG (Permuted Congruential Generator) algorithm to support most of the methods provided in math/rand. This algorithm is a pseudo-random number generator (PRNG) that generates the same random number sequence whenever the initial seed (state) is given, exhibiting fully deterministic characteristics. It also makes patterns and long period[^1] difficult to predict through the permutation process.
The implementation basically uses a linear congruential method but modifies the results through additional bitwise operations such as XOR. This approach improves the statistical quality of random numbers and reduces predictability.
Compatibility
I tried to match the methods provided in the math/rand package as much as possible, but there are a few differences. The table below compares the main functions between each implementation function name and includes descriptions.
| math/rand | PCG | description |
|---|---|---|
Seed |
Seed |
Sets the seed value (initial state) of the PRNG. |
Int |
Uint32 Uint64 |
Returns a random integer. PCG32 returns 32 bits, and PCG64 returns 64 bits. |
Intn |
Uint32n Uint64n |
Returns a random integer within the given range. |
Float64 |
Float64 |
Returns a floating-point number between 0.0 and 1.0. |
Perm |
Perm |
Returns a randomly shuffled array of integers from 0 to n-1. |
Read |
Read |
Fills a byte slice with random data. |
Shuffle |
Shuffle |
Randomly shuffles the elements of a given slice. |
NormFloat64 |
X | Returns a floating-point number following a standard normal distribution. (Not provided in PCG) |
ExpFloat64 |
X | Returns a floating-point number following an exponential distribution. (Not provided in PCG) |
Performance and Benchmarks
Currently, direct benchmarking is not possible in Gno, so I measured using Go. The Go implementation is named MathRand, and PCG is named PCG in the benchmark identifiers. Please refer to the following link for detailed benchmark code.
Based on the benchmarks I conducted using Go implementations, the PCG algorithm generally outperforms the standard math/rand package in terms of generating single random numbers. PCG32 and PCG64 are approximately twice as fast as their math/rand counterparts. When generating random numbers within a given range, PCG also demonstrates superior performance, with PCG32 and PCG64 being nearly 3 times faster than math/rand's Intn method. In the case of shuffling, PCG32 and PCG64 exhibit better performance compared to math/rand, with PCG32 being the fastest.
However, when it comes to generating random floating-point numbers, PCG implementation is slightly slower than Go's math/rand package. Overall, the PCG algorithm provides improved performance in most scenarios while maintaining high-quality random number generation.
Generate Single Random Number
| Benchmark | # of Iterations | Elapsed Time (ns/op) | # Allocated Bytes (B/op) | Allocations Per Operation |
|---|---|---|---|---|
| BenchmarkPCG32Rand | 441127071 | 2.764 | 0 | 0 |
| BenchmarkMathRand32 | 201966975 | 6.017 | 0 | 0 |
| Benchmark | # of Iterations | Elapsed Time (ns/op) | # Allocated Bytes (B/op) | Allocations Per Operation |
|---|---|---|---|---|
| BenchmarkPCG64Rand | 398531355 | 3.037 | 0 | 0 |
| BenchmarkMathRand64 | 184649094 | 6.665 | 0 | 0 |
Generate Singe Number In Given Range
| Benchmark | # of Iterations | Elapsed Time (ns/op) | # Allocated Bytes (B/op) | Allocations Per Operation |
|---|---|---|---|---|
| BenchmarkPCG32_Uintn32 | 440600071 | 2.820 | 0 | 0 |
| BenchmarkPCG64_Uintn64 | 386100385 | 3.201 | 0 | 0 |
| Benchmark_MathRanIntn | 151258676 | 7.877 | 0 | 0 |
Read
| Benchmark | # of Iterations | Elapsed Time (ns/op) | # Allocated Bytes (B/op) | Allocations Per Operation |
|---|---|---|---|---|
| BenchmarkPCG32Read | 2884942 | 416.1 | 0 | 0 |
| BenchmarkPCG64Read | 2970010 | 404.9 | 0 | 0 |
| BenchmarkMathRandRead | 1403140 | 857.8 | 0 | 0 |
Shuffle
| Benchmark | # of Iterations | Elapsed Time (ns/op) | # Allocated Bytes (B/op) | Allocations Per Operation |
|---|---|---|---|---|
| BenchmarkPCG32Shuffle | 386566 | 3120 | 0 | 0 |
| BenchmarkPCG64Shuffle | 289429 | 4131 | 0 | 0 |
| BenchmarkMathRandShuffle | 270508 | 4389 | 0 | 0 |
Float
Generating random floating-point numbers is slightly slower than Go's implementation.
| Benchmark | # of Iterations | Elapsed Time (ns/op) | # Allocated Bytes (B/op) | Allocations Per Operation |
|---|---|---|---|---|
| BenchmarkPCGFloat64 | 359199702 | 3.046 | 0 | 0 |
| BenchmarkPCGFloat64Full | 400288152 | 3.002 | 0 | 0 |
| BenchmarkMathRandFloat64 | 495593946 | 2.412 | 0 | 0 |
Limitations
However, the current PCG implementation does not guarantee a completely uniform distribution. This is because when applying chi-square[^2] test, the p-value[^3] is smaller than the significance level, leading to the rejection of the null hypothesis. Nevertheless, it exhibits a certain level of uniformity, so it is believed that this will not pose a significant problem for a simple purpose (.
Below are the distributions of random numbers generated by PCG and math/rand under the same conditions. The blue color (top) is PCG. See the link for the code to generate the distribution graph.
[^1]: In a PRNG, the priod refers to the number of generated values before the sequence starts repeating itself. It is a crucial property that determines the quality and effectiveness of the PRNG algorithm. A longer period indicates that the generated sequence will have more randome and less predictable pattern. The PCG is designed to have an extremely long period, typically in the order of 2^64 or higher ref.
[^2]: The chi-square test is a stastical hypothesis test that measures the goodness of fit between the observed and expected frequencies under the assumption of a specific distribution. In this context, it is used to assess the uniformity of the generated random numbers by comparing their distribution to the expected uniform distribution.
[^3]: The p-value is the probability of observing a test stastic as extreme as the one calculated from the sample data. assuming the null hypothesis(H0) is true. In hypothesis testing, if the p-value is smaller than the predetermined significance level (e.g., 0.05 mostly), the H0 is rejected, indicating that the observed data is unlikely to have occurred by chance alone. The alternative hypothesis(H1), in this case, would state the generated random numbers do not follow a uniform distribution.
That's great! Ideally, we should complete porting math/rand to ensure that most Go code using basic functions like pickRandom(choices) can still function properly.
That's great! Ideally, we should complete porting
math/randto ensure that most Go code using basic functions likepickRandom(choices)can still function properly.
Thanks! I'm continuing to work on supporting the full functionality of math/rand as much as possible.
@notJoon even though its still in draft, amazing effort 👏
I'm not the expert on this but, since we are talking blockchains and determinism - I assume you plan to use this package just as a demo package, since the values should be guessable with enough effort from an attacker's perspective. Is this correct?
I'm not the expert on this but, since we are talking blockchains and determinism - I assume you plan to use this package just as a demo package, since the values should be guessable with enough effort from an attacker's perspective. Is this correct?
It is almost impossible to predict random values without knowing the initial state and seed value. However, I have not yet conducted the NIST SP 800-22 tests (well known test suite for PRNG), and a lot of verification is needed to ensure complete security. Therefore, I made it at the demo level.
cc @kristovatlas
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 46.72%. Comparing base (
dd68d61) to head (a93fc36). Report is 1 commits behind head on master.
Additional details and impacted files
@@ Coverage Diff @@
## master #1993 +/- ##
==========================================
- Coverage 48.44% 46.72% -1.72%
==========================================
Files 409 492 +83
Lines 61965 69614 +7649
==========================================
+ Hits 30019 32530 +2511
- Misses 29446 34371 +4925
- Partials 2500 2713 +213
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
Over the past few days, I created a test framework based on the NIST SP-800-22 paper to verify the random numbers generated in the implementation and performed the verification. I set the significance level to 0.01 (If the result is greater than this value, it means it is random), and for the remaining values, I used the recommended input values from the paper and ran the tests. The results are as follows: (The universal test was run separately as it requires a large input size)
+---------------------------------------------+---------------------------------------------------------------------------------------------+--------+
| NIST STATISTICAL TEST SUITE | P-VALUE | RESULT |
+---------------------------------------------+---------------------------------------------------------------------------------------------+--------+
| Frequency (Monobit) Test | 0.53 | Pass |
| Frequency Test within a Block | 0.24 | Pass |
| Runs Test | 0.20 | Pass |
| Test for the Longest Run of Ones in a Block | 0.20 | Pass |
| Binary Matrix Rank Test | 0.33 | Pass |
| Discrete Fourier Transform (Spectral) Test | 0.53 | Pass |
| Non-overlapping Template Matching Test | 1.00 | Pass |
| Overlapping Template Matching Test | 0.85 | Pass |
| Linear Complexity Test | 0.81 | Pass |
| Serial Test | [0.82 0.95] | Pass |
| Approximate Entropy Test | 1.00 | Pass |
| Cumulative Sums Test | 0.63 | Pass |
| Random Excursions Test | [0.29 0.76 0.09 0.53 0.58 0.87 0.39 0.83] | Pass |
| Maurer's Universal Statistical Test | 1.00 | Pass |
| Random Excursions Variant Test | [0.32 0.44 0.42 0.30 0.49 0.67 0.61 0.72 0.92 0.35 0.51 0.64 0.78 0.76 0.58 0.77 0.89 0.98] | Pass |
+---------------------------------------------+---------------------------------------------------------------------------------------------+--------+
| | TOTAL TESTS | 15 |
| | PASS | 15 |
| | FAIL | 0 |
+---------------------------------------------+---------------------------------------------------------------------------------------------+--------+
While these tests can demonstrate that the generated random numbers have statistical randomness, they do not guarantee cryptographic security. Although it is difficult to reverse-engineer the seed using only the generated values without knowing the initial state or seed, there is still a possibility, however small.
Nevertheless, Go's math/rand package recommends using it only for parts that do not require security and suggests using crypto/rand (partially implemented in Gno) when security is crucial. Therefore, there seems to be no significant issue with providing random function support itself.
Currently, the initial seed value is set to be specified by the user, as hiding it remains a challenge. If it becomes possible to utilize the operating system's entropy source in the future, this aspect could be improved.
Hey @notJoon, apologies for the late reply on this. And thank you for the great PR as always.
We have a requirement for a random number generator in Gno for an initiative at Gophercon, hence why I created #2455. Because it ports over math/rand/v2, which uses PCG, do you think we need this now?
Hey @notJoon, apologies for the late reply on this. And thank you for the great PR as always.
We have a requirement for a random number generator in Gno for an initiative at Gophercon, hence why I created #2455. Because it ports over math/rand/v2, which uses PCG, do you think we need this now?
Thanks for reply @thehowl. I implemented this as a temporary measure when the random package was not available, so I think it may not have much significance now. Also, there was still some ambiguity because still couldn't access the entropy source.