Add a function to split text at the first occurrence of a character
I was surprised not to find a function in the library to split a text on the first match of a character/text/predicate, with the matched part discarded.
This is not difficult to write using breakOn and drop, but I think a function implemented with the internals of the library might have better performance.
split1 :: Char -> Text -> (Text, Text)
split1 c = (id *** drop 1) . (breakOn $ singleton c)
See a similar function (previously called splitOnce) in the byteslice package, and a question on Stack Overflow about splitAtfirst.
IMHO split1 c = fmap (drop 1) . break (== c) is short enough and does not suffer from any significant performance penalty: drop 1 is constant-time, negligible in comparison to the linear time of break.
Note that the function from byteslice has a different signature, with Maybe.
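For illustration, a Maybe-returning variant for Text could look roughly like the sketch below. The name split1Maybe is hypothetical (it is not part of text or byteslice), and the sketch only assumes that a missing separator should yield Nothing rather than the whole input paired with an empty remainder.

```haskell
import Data.Text (Text)
import qualified Data.Text as T

-- Hypothetical helper, not a library function: split at the first
-- occurrence of c, discarding c, or return Nothing if c never occurs.
split1Maybe :: Char -> Text -> Maybe (Text, Text)
split1Maybe c t =
  case T.break (== c) t of
    (before, rest)
      | T.null rest -> Nothing                       -- separator not found
      | otherwise   -> Just (before, T.drop 1 rest)  -- rest starts with c
```

With the tuple-returning split1, a caller cannot tell "abc" (no separator) apart from "abc:" (separator at the end); the Maybe result makes that distinction explicit.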
@Bodigrim comparing
split1 c = fmap (drop 1) . (breakOn $ singleton c)
vs
split1 c = fmap (drop 1) . break (== c)
The first option, using breakOn, should be significantly faster than the second option, no?
Rationale:
breakOn can look at the needle and choose the optimal strategy (specifically, if c is in the ASCII range, it can just do a sequential scan).
break, on the other hand, always needs to decode to Char.
Corresponding code: https://github.com/haskell/text/blob/5e57460711a9a5ab7f8a30f0e11cd850018dae70/src/Data/Text/Internal/Search.hs#L49-L56
https://github.com/haskell/text/blob/5e57460711a9a5ab7f8a30f0e11cd850018dae70/src/Data/Text/Internal/Search.hs#L93-L100
@sol yes, I'd expect breakOn to be faster.
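For anyone who wants to measure this, here is a minimal self-contained sketch of the two variants side by side. The names split1BreakOn and split1Break are made up for the comparison; the bodies are the two definitions quoted above, written with qualified imports so they compile on their own.

```haskell
import Data.Text (Text)
import qualified Data.Text as T

-- Variant 1: search for a one-character needle with breakOn.
-- The needle is never empty, so breakOn's precondition holds.
split1BreakOn :: Char -> Text -> (Text, Text)
split1BreakOn c = fmap (T.drop 1) . T.breakOn (T.singleton c)

-- Variant 2: scan with a Char predicate using break.
split1Break :: Char -> Text -> (Text, Text)
split1Break c = fmap (T.drop 1) . T.break (== c)
```

Both should return the same pairs, including the whole input paired with an empty Text when c does not occur, so any difference between them is purely one of performance.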