fixed-vector
mk# functions do not fuse for some reason
Test case:
import qualified Data.Vector.Fixed as F
import Data.Vector.Fixed.Unboxed (Vec3)
main :: IO ()
main =
print $ F.sum (F.mk3 1 2 3 :: Vec3 Double)
Maybe you are not passing enough optimization flags to GHC? For this test I get the following Core:
main3
main3 =
\ @ s_a2nv s_a2xF ->
case newByteArray# 24 (s_a2xF `cast` ...)
of _ { (# ipv_a2zv, ipv1_a2zw #) ->
case writeDoubleArray# ipv1_a2zw 0 1.0 ipv_a2zv
of s#_a2DU { __DEFAULT ->
case writeDoubleArray# ipv1_a2zw 1 2.0 s#_a2DU
of s#1_X2EE { __DEFAULT ->
case unsafeFreezeByteArray#
ipv1_a2zw (writeDoubleArray# ipv1_a2zw 2 3.0 s#1_X2EE)
of _ { (# ipv2_a2x5, ipv3_a2x6 #) ->
(# ipv2_a2x5 `cast` ..., (ByteArray ipv3_a2x6) `cast` ... #)
}
}
}
}
main_str
main_str =
case (runSTRep main3) `cast` ... of _ { ByteArray arr#_a2FZ ->
case indexDoubleArray# arr#_a2FZ 0 of wild1_a2G9 { __DEFAULT ->
case indexDoubleArray# arr#_a2FZ 1 of wild2_X2Gr { __DEFAULT ->
case indexDoubleArray# arr#_a2FZ 2 of wild3_X2Gy { __DEFAULT ->
let {
ww_a2Hv
ww_a2Hv = +## (+## wild1_a2G9 wild2_X2Gr) wild3_X2Gy } in
case <## ww_a2Hv 0.0 of _ {
False ->
case {__pkg_ccall base isDoubleNegativeZero Double#
-> State# RealWorld -> (# State# RealWorld, Int# #)}_a2HL
ww_a2Hv realWorld#
of _ { (# _, ds1_a2HQ #) ->
case ds1_a2HQ of _ {
__DEFAULT ->
: $fShowDouble4
(++
($w$sformatRealFloat FFGeneric (Nothing) (negateDouble# ww_a2Hv))
([]));
0 -> ++ ($w$sformatRealFloat FFGeneric (Nothing) ww_a2Hv) ([])
}
};
True ->
: $fShowDouble4
(++
($w$sformatRealFloat FFGeneric (Nothing) (negateDouble# ww_a2Hv))
([]))
}}}}}
...
I think it is pretty good. With (Double, Double, Double) instead of Vec3 Double, GHC boils the vector ops down to just the constant 6.0.
My run:
ghc --make -O2 -funbox-strict-fields -funfolding-keeness-factor1000 -fexpose-all-unfoldings \
    -ddump-simpl -dsuppress-all fv.hs > fv-core.hs
This Core illustrates the problem nicely. The intermediate vector created by mk3 is not eliminated, although it should be. For some reason the cvec/vector rule didn't fire. For other functions it does fire. For example, if one replaces mk3 with replicate, the rule fires nicely and GHC is able to reduce the expression to a constant.
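For comparison, the variants mentioned in this thread can be put side by side in one file. This is only a sketch of the test case with the type or the constructor swapped; it assumes a fixed-vector version that provides mk3, replicate and the tuple instances, as the comments above imply.

```haskell
import qualified Data.Vector.Fixed as F
import Data.Vector.Fixed.Unboxed (Vec3)

main :: IO ()
main = do
  -- Original test case: mk3 building an opaque unboxed vector.
  print $ F.sum (F.mk3 1 2 3 :: Vec3 Double)
  -- Same expression at a tuple type, which GHC can take apart by itself.
  print $ F.sum (F.mk3 1 2 3 :: (Double, Double, Double))
  -- replicate instead of mk3, still at the unboxed type.
  print $ F.sum (F.replicate 1 :: Vec3 Double)
```

Compiling this with the flags from the run above and comparing the Core for the three expressions should show which of them still allocate an intermediate array.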
Unboxed vectors are particularly suitable for testing fusion. They are completely opaque, so GHC cannot tear them down like it did with the tuple.
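To make the role of a rule like cvec/vector concrete, here is a minimal self-contained sketch that uses only base. It is not fixed-vector's actual code: ContVec3, V3, toCont, fromCont, mk3Toy, sumToy and the rule name are all hypothetical stand-ins. The idea is that the constructor builds a continuation-form vector and converts it to the concrete type, the consumer converts back, and a rewrite rule cancels the round trip so the concrete value never has to exist.

```haskell
{-# LANGUAGE RankNTypes #-}

-- Hypothetical continuation-form vector of exactly three elements.
newtype ContVec3 a = ContVec3 (forall r. (a -> a -> a -> r) -> r)

-- Hypothetical concrete vector, standing in for an opaque unboxed Vec3.
data V3 = V3 !Double !Double !Double

-- Concrete vector to continuation form.
toCont :: V3 -> ContVec3 Double
toCont (V3 x y z) = ContVec3 (\k -> k x y z)
{-# INLINE [1] toCont #-}

-- Continuation form to concrete vector.
fromCont :: ContVec3 Double -> V3
fromCont (ContVec3 f) = f V3
{-# INLINE [1] fromCont #-}

-- Cancels the round trip through V3 when both conversions meet.
{-# RULES "toCont/fromCont" forall v. toCont (fromCont v) = v #-}

-- Constructor in the style of mk3: build in continuation form, then convert.
mk3Toy :: Double -> Double -> Double -> V3
mk3Toy x y z = fromCont (ContVec3 (\k -> k x y z))
{-# INLINE mk3Toy #-}

-- Consumer in the style of sum: convert to continuation form, then fold.
sumToy :: V3 -> Double
sumToy v = case toCont v of
  ContVec3 f -> f (\x y z -> x + y + z)
{-# INLINE sumToy #-}

main :: IO ()
main = print (sumToy (mk3Toy 1 2 3))  -- prints 6.0
```

The result is the same whether or not the rule fires; only the generated Core differs. Compiling with -O2 -ddump-rule-firings should show whether the rule was applied. If a constructor is inlined in a way that never exposes the toCont (fromCont ...) pattern, the rule has nothing to match, which is one possible explanation to check for mk3.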