text
text copied to clipboard
Add Data.Text.Lazy.Builder.toText
This is a trivial composition, but it has been a recurring inconvenience for me for years. Particularly when I'm writing introductory tutorials, it is difficult to explain why among the three major types (strict text, lazy text, and builder) only five of the six possible conversion functions are given in the library.
But there is a reason. If you make a Builder
, then convert to lazy Text
and then again to strict Text
you are most likely doing some completely unnecessary copying, instead of directly doing thing you want to do with lazy Text
.
I'm not convinced it is a good idea to have this function. If you really need that conversion, write it explicitly in your code, and wonder whether you can afford the conversion. Hiding this fact, especially in introductionary tutorials, is not good idea IMO.
The same reasoning there isn't way to convert bytestring
's Builder
to strict ByteString
.
Perhaps then should Data.Text.Lazy.toStrict
have a warning on it, or perhaps even be deprecated, for the same reason? Or does the principle that lazy text ought to be used directly, rather than be made strict, only apply to lazy text that was produced using Builder
?
I frequently need this function, as well. When I need it, I don't normally find myself thinking, "I wonder if I can afford the conversion." Instead, I normally find myself thinking "Dammit, why does this library function (or record) take strict Text
? I guess I'd better stop what I'm doing, add an import to the top of my file, and do the conversion." Why am I using such terrible libraries, you might ask? Well, the function had to take something and strict Text
is just what they decided.
To me, the above reasoning behind not including the one missing conversion is flimsy at best and at worst creates a sliding goalpost. As already pointed out above, less-than-optimal functions already abound. Why is the policy on admitting only the most optimal functions not uniformly followed? The answer is that it'd make for a library that nobody would want to use.
Finally, while I understand the passion for conceptual understanding that motivates opinions such as, "Hiding this fact, especially in introductionary tutorials, is not good idea IMO," I am obliged to point out that the literature disagrees with you. Identification of learners' zone of proximal development combined with careful scaffolding is an important component in lesson design, even for adult learners (https://journals.sagepub.com/doi/full/10.3102/0034654316670999). All this amounts to not overwhelming the learner with too much information at once. I'm sure you're familiar with the passage from the Haskell Report which I quote, "[Haskell] should be suitable for teaching, research, and applications, including building large systems." Notice that "teaching" is the first stated goal, followed by research, with applications coming in third.
I think we'd be losing sight of larger goals if we were to allow an argument from premature optimization to affect our decisions here when the suggested addition serves both an industrially-pragmatic and pedagogically-sound purpose.
Ok.
Let's have toStrictText
and toStrictTextWith
.
I'd also welcome a similar PR to bytestring
, though I'm not a maintainer of that library. ping @sjakobi
I'd also welcome a similar PR to
bytestring
, though I'm not a maintainer of that library. ping @sjakobi
Another argument against this addition is the good old Fairbairn threshold.
I currently feel about -1/2 on these additions.
The other bytestring maintainers might disagree though. CC @Bodigrim, @hsyl20, @bgamari, @chessai, @dcoutts, @hvr.
As a maintainer of bytestring
, I don't see much point in adding trivial, type-driven plumbing of this kind. GHC should be perfectly capable to suggest toStrict
instead of a hole.
From my point of view the real annoyance here is that Data.Text
does not export toStrictText
, so one has to jump to the import list and add Data.Text.Lazy
(with some qualified namespace). Could we possibly export toStrictText
/ fromStrictText
from Data.Text
?
Two opportunities I think this function affords:
- It is a good place to attach documentation. If you search Hoogle for
Builder -> Text
, this function can come up, and it can include @phadej's warning about the circumstances in which one would be better off preferring lazy text to avoid copying. - Although I unconfidently implemented this function as the trivial composition
toText = L.toStrict . toLazyText
, @Bodigrim, I was hoping that this simple starting point might eventually give way to a better definition. I'm not sure it's actually necessary to construct the intermediate lazyText
value; instead of passing the chunk list toData.Text.Lazy.fromChunks
(in definition oftoLazyTextWith
), perhaps we could applyData.Text.concat
to it directly? I am sympathetic to Fairbairn, @sjakobi, but adding more combinations to the library can often give the library authors more ability to give us a more performant composition.
I am in favor of adding this function, agreeing with @chris-martin's motivations above, and particularly:
it is difficult to explain why among the three major types (strict text, lazy text, and builder) only five of the six possible conversion functions are given in the library.
- It's worth adding some comment that it does extra copying so users who care about that can think twice about it.
- I also think the more explicit name
toStrictText
is better.
@haskell/text any objections?
As long as there is no expectation for bytestring
to follow suit, I don't object.
I'd be happy to include the function. Even if we immediately throw a WARNING
or DEPRECATION
pragma and suggest something better, it's always good to have easily discoverable names that we can attach documentation to.
@Lysxia if you are still in favor of the idea, feel free to merge.