quickcheck
quickcheck copied to clipboard
Random generation is very slow
For testing my binding to the lz4 compression library (https://github.com/nh2/lz4-frame-conduit), I need to generate a lot of large ByteStrings.
Unfortunately I can barely use quickcheck for this, because its random generator is too slow.
randomonly generates only 5-8 MB/s (https://github.com/haskell/random/issues/51)tf-randomgenerates 3-8 MB/s in the same test
When actually generating ByteStrings using quickcheck-instances and size = 1000 on e.g.
QC.property $ \(bsList :: [ByteString]) -> QCM.monadicIO $ do
QCM.run $ hPutStrLn stderr (show $ sum $ map BS.length bsList)
QCM.assert True
it generates me 1 MB/s; generating [String] is only slightly faster.
These numbers are very low compared to the 230 MB/s /dev/urandom gets me on the same machine.
Can we do something about it?
Something is weird; when I last time benchmarked tf-random was 5-10 time faster than random. How you generate your ByteStrings?
quickcheck-instances way of generating is not optimal, but nobody mentioned it's too slow for their needs (it generates list of Word8s, and not Word32 which tf-random generates natively; so it's at least 4x slower than it could be).
How you generate your
ByteStrings?
Using quickcheck-instances, as you said.
I'm now trying out to work around this slowness by using pcg-random (with a seed and target length generated by QuickCheck's choose) to generate the ByteStrings. I've measured that this works at ~300 MB/s, even faster than /dev/urandom. But of course using this workaround isn't great.
Something is weird; when I last time benchmarked
tf-randomwas 5-10 time faster thanrandom.
Hmm, let me give that a "quick check".
@nh2 could you try this https://github.com/phadej/qc-instances/compare/master...phadej:faster-bytestring version of quickcheck-instances, how much faster than released it is in your case (and how much slower than pcg-random) ?
In the simple benchmark there, it was already 20x faster.
(I'll resort to crreateAndTrim next)
I think that the best place to fix this particular problem is in the Arbitrary ByteString instance (e.g. using @phadej's patch).
More generally, there are a couple of things that might cause Gen to be slow:
Genuses splitting to distribute random numbers throughout the computation, which may be slower than generating random numbers in the traditional way. The fix to this would be to changeGeninto a state monad, but I don't want to do that as it would no longer be possible to generate infinite structures.- When I looked at the Core generated by GHC a while ago, it seemed rather suboptimal. I seem to remember a lot of combinators were not getting specialised as one would hope. I've been meaning to look more into this but not got round to it yet.
@nick8325
there are a couple of things that might cause
Gento be slow
Just a thought, but generating infinite structures only works when the bottom of the monad stack is Identity, right? If so, perhaps it's worth considering StateT over splitting(?)
https://hackage.haskell.org/package/quickcheck-instances-0.3.20 released with faster ByteString generator
Seeing as this was fixed in quickcheck-isntances I'm going to close this.