Performance enhancements for single- and multi-threaded evaluation
Experiments show that a few small changes give a large boost (3x faster in synthetic tests). The gain comes from less synchronization and much better JIT code generation.
Measured speedup:
Before:

    $ ./threads.sh 1
    Testing THREADS=1 TIME=3000
    Finished 17 tests sum=2.605197E11
    Thread test finished total=17

    $ ./threads.sh 4
    Testing THREADS=4 TIME=3000
    Finished 12 tests sum=1.689628E11
    Finished 12 tests sum=1.929169E11
    Finished 12 tests sum=2.02259E11
    Finished 11 tests sum=1.705385E11
    Thread test finished total=47

After:

    $ ./threads.sh 1
    Testing THREADS=1 TIME=3000
    Finished 42 tests sum=6.101973E11
    Thread test finished total=42

    $ ./threads.sh 4
    Testing THREADS=4 TIME=3000
    Finished 26 tests sum=3.680051E11
    Finished 22 tests sum=3.780026E11
    Finished 26 tests sum=4.324088E11
    Finished 27 tests sum=3.765051E11
    Thread test finished total=101
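For the threads.sh runs above, the speedup can be read off the "total=" lines: roughly 2.5x with one thread and 2.1x with four. The snippet below is only a sketch of that arithmetic; the names `totals` and `speedup` are made up for illustration and are not part of this change.

```python
# Totals copied from the "Thread test finished total=..." lines above.
# Keys are the THREADS value; every run used the same TIME=3000 setting.
totals = {
    1: {"before": 17, "after": 42},
    4: {"before": 47, "after": 101},
}


def speedup(before: int, after: int) -> float:
    """Ratio of tests completed after vs. before, for the same TIME setting."""
    return after / before


for threads, t in totals.items():
    print(f"THREADS={threads}: {speedup(t['before'], t['after']):.2f}x")
```

The four-thread totals are the sums of the per-thread counts (12+12+12+11 = 47 before, 26+22+26+27 = 101 after).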
Should the performance numbers in the README be updated as well?