(Unboxed.Vector Char).takeWhile is much slower then Data.Text.takeWhile
Hi!
I know that Data.Text has different underlying representation, but no matter what it is, it should not be faster than Unboxed Vector Char when performing takeWhile. Both libraries implement takeWhile with the same signature of takeWhile :: (Char -> Bool) -> X -> X, so they are comparing Char by Char. Unboxed Vector know the shift between elements, so if Vector.Unboxed.takeWhile is 30% slower than Data.Text.takeWhile, I think we should consider it as a bug, right?
Just to document my findings from last night:
The likely reason for this is that vector implements takeWhile as streaming. If you follow the chain of calls, it will stream out of the original vector, use takeWhile on streams, and stream into a new vector.
text has two possible implementations of takeWhile. The definition is a loop that slices the original text value. Then there are rewrite rules to turn calls into a streaming takeWhile. But there are also rewrite rules back to the loop-and-slice code if fusion does not occur.
The test case was effectively takeWhile (const True) on a big text/vector. In this case, text's approach is categorically superior, because it will just return the original text value. In general there is a trade off between the slicing code being faster (because it doesn't require building a new array), but keeping possibly unused memory live (i.e. slicing big vectors into tiny ones without copying is bad).
Maybe it's possible to make this better a more intelligent bundle takeWhile. I'll have to think about it.
@dolio: It's great you've found out what could be the case. I would be super interested in knowing if we are in general able in Haskell do it more intelligent (cover both use cases with fancy rewrite rules). If not, maybe we should provide both ways in Vector package, so user will decide what is important for him in a particular situation?
@dolio: I'm sorry to disturb you, but I just would love to know if you've thinked about it and if you see any simple / straightforward solution for this problem?