Trevor L. McDonell
It sounds like your `liftAcc` is [`foreignAcc`](http://hackage.haskell.org/package/accelerate-1.3.0.0/docs/Data-Array-Accelerate.html#v:foreignAcc): call some foreign code and return the result back into accelerate?
The fallback choice allows you to chain multiple implementations, presumably ending with one written in pure accelerate (or an error). I don't think we can really entirely remove the dependency on...
Generating large test arrays is important for the parallel backends, because they can use different implementations for larger arrays (for example the GPU backends need to use multiple thread blocks...
Hi @noughtmare, sorry for the slow response. This looks great! I wonder whether this will work together with hedgehog's notion of random number generation; that is, shrinking on failure? At...
I have pulled one of the [PractRand](http://pracrand.sourceforge.net) generators out into the [sfc-random-accelerate](https://github.com/tmcdonell/sfc-random-accelerate) package.
I don't think converting to `Generate` limits the kinds of optimisations you can do. If you want to keep it around throughout fusion, then you need a special case just...
These are the stats from the Trafo stage, at accelerate compile time / Haskell runtime. [Weakening](https://github.com/AccelerateHS/accelerate/blob/master/Data/Array/Accelerate/Trafo/Substitution.hs#L203) is the process of increasing the size of the environment of a term.
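To make the idea of weakening concrete, here is a minimal sketch on a toy, untyped lambda calculus with de Bruijn indices. This is an illustration only: the `Term` type and the `weaken` function below are hypothetical names, and this is not the code in `Substitution.hs`, which works over Accelerate's typed AST. The sketch only shows the core idea of shifting free variables so a term remains valid in a larger environment.

```haskell
module Main where

-- Toy untyped lambda terms with de Bruijn indices (hypothetical type, for
-- illustration only). Var 0 refers to the innermost enclosing binder.
data Term
  = Var Int
  | Lam Term
  | App Term Term
  deriving (Eq, Show)

-- Weaken a term by one: shift every free variable up by one, so the term is
-- valid in an environment with one extra (unused) entry at the front.
-- The cutoff counts the binders passed so far; indices below it are bound
-- inside the term and must not be shifted.
weaken :: Term -> Term
weaken = go 0
  where
    go cutoff (Var i)
      | i >= cutoff = Var (i + 1)
      | otherwise   = Var i
    go cutoff (Lam b)   = Lam (go (cutoff + 1) b)
    go cutoff (App f x) = App (go cutoff f) (go cutoff x)

main :: IO ()
main = do
  print (weaken (Var 0))                      -- free variable is shifted
  print (weaken (Lam (App (Var 0) (Var 1))))  -- bound Var 0 stays, free Var 1 shifts
```

Every call to `weaken` traverses the whole term, so weakening repeatedly under many binders can add up on large programs, which is one way a step like this can become visible in phase statistics.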
Some of the suggestions on how to improve GHC performance apply here too, like improving infrastructure to narrow down problems easily, rather than shooting in the dark (the current situation)....
Update using ghc-8.0.1 and tmcdonell/lulesh-accelerate@b321f401ee1d0349d96de825f67a240508b26c1c

```
> stack exec lulesh-accelerate -- --llvm-cpu +RTS -N -t -RTS +ACC -ddump-simpl-stats -ddump-phases -ACC
...
1.483:phase sharing-recovery: 331.455 ms (wall), 879.342 ms (cpu) 228.5...
```
Just setting RTS flags `-A128M -n4m` helps enormously, and we go from 57% productivity to 99%.

```
> stack exec lulesh-accelerate -- --llvm-cpu +RTS -N -s -A128M -n4m -RTS +ACC...
```