text icon indicating copy to clipboard operation
text copied to clipboard

Add Data.Text.Lazy.Builder.toText

Open chris-martin opened this issue 3 years ago • 11 comments

This is a trivial composition, but it has been a recurring inconvenience for me for years. Particularly when I'm writing introductory tutorials, it is difficult to explain why among the three major types (strict text, lazy text, and builder) only five of the six possible conversion functions are given in the library.

chris-martin avatar Aug 25 '20 00:08 chris-martin

But there is a reason. If you make a Builder, then convert to lazy Text and then again to strict Text you are most likely doing some completely unnecessary copying, instead of directly doing thing you want to do with lazy Text.

I'm not convinced it is a good idea to have this function. If you really need that conversion, write it explicitly in your code, and wonder whether you can afford the conversion. Hiding this fact, especially in introductionary tutorials, is not good idea IMO.

The same reasoning there isn't way to convert bytestring's Builder to strict ByteString.

phadej avatar Aug 25 '20 06:08 phadej

Perhaps then should Data.Text.Lazy.toStrict have a warning on it, or perhaps even be deprecated, for the same reason? Or does the principle that lazy text ought to be used directly, rather than be made strict, only apply to lazy text that was produced using Builder?

chris-martin avatar Aug 25 '20 08:08 chris-martin

I frequently need this function, as well. When I need it, I don't normally find myself thinking, "I wonder if I can afford the conversion." Instead, I normally find myself thinking "Dammit, why does this library function (or record) take strict Text? I guess I'd better stop what I'm doing, add an import to the top of my file, and do the conversion." Why am I using such terrible libraries, you might ask? Well, the function had to take something and strict Text is just what they decided.

To me, the above reasoning behind not including the one missing conversion is flimsy at best and at worst creates a sliding goalpost. As already pointed out above, less-than-optimal functions already abound. Why is the policy on admitting only the most optimal functions not uniformly followed? The answer is that it'd make for a library that nobody would want to use.

Finally, while I understand the passion for conceptual understanding that motivates opinions such as, "Hiding this fact, especially in introductionary tutorials, is not good idea IMO," I am obliged to point out that the literature disagrees with you. Identification of learners' zone of proximal development combined with careful scaffolding is an important component in lesson design, even for adult learners (https://journals.sagepub.com/doi/full/10.3102/0034654316670999). All this amounts to not overwhelming the learner with too much information at once. I'm sure you're familiar with the passage from the Haskell Report which I quote, "[Haskell] should be suitable for teaching, research, and applications, including building large systems." Notice that "teaching" is the first stated goal, followed by research, with applications coming in third.

I think we'd be losing sight of larger goals if we were to allow an argument from premature optimization to affect our decisions here when the suggested addition serves both an industrially-pragmatic and pedagogically-sound purpose.

friedbrice avatar Aug 25 '20 16:08 friedbrice

Ok.

Let's have toStrictText and toStrictTextWith.


I'd also welcome a similar PR to bytestring, though I'm not a maintainer of that library. ping @sjakobi

phadej avatar Aug 25 '20 17:08 phadej

I'd also welcome a similar PR to bytestring, though I'm not a maintainer of that library. ping @sjakobi

Another argument against this addition is the good old Fairbairn threshold.

I currently feel about -1/2 on these additions.

The other bytestring maintainers might disagree though. CC @Bodigrim, @hsyl20, @bgamari, @chessai, @dcoutts, @hvr.

sjakobi avatar Aug 25 '20 17:08 sjakobi

As a maintainer of bytestring, I don't see much point in adding trivial, type-driven plumbing of this kind. GHC should be perfectly capable to suggest toStrict instead of a hole.

From my point of view the real annoyance here is that Data.Text does not export toStrictText, so one has to jump to the import list and add Data.Text.Lazy (with some qualified namespace). Could we possibly export toStrictText / fromStrictText from Data.Text?

Bodigrim avatar Aug 25 '20 17:08 Bodigrim

Two opportunities I think this function affords:

  1. It is a good place to attach documentation. If you search Hoogle for Builder -> Text, this function can come up, and it can include @phadej's warning about the circumstances in which one would be better off preferring lazy text to avoid copying.
  2. Although I unconfidently implemented this function as the trivial composition toText = L.toStrict . toLazyText, @Bodigrim, I was hoping that this simple starting point might eventually give way to a better definition. I'm not sure it's actually necessary to construct the intermediate lazy Text value; instead of passing the chunk list to Data.Text.Lazy.fromChunks (in definition of toLazyTextWith), perhaps we could apply Data.Text.concat to it directly? I am sympathetic to Fairbairn, @sjakobi, but adding more combinations to the library can often give the library authors more ability to give us a more performant composition.

chris-martin avatar Aug 25 '20 19:08 chris-martin

I am in favor of adding this function, agreeing with @chris-martin's motivations above, and particularly:

it is difficult to explain why among the three major types (strict text, lazy text, and builder) only five of the six possible conversion functions are given in the library.

  1. It's worth adding some comment that it does extra copying so users who care about that can think twice about it.
  2. I also think the more explicit name toStrictText is better.

@haskell/text any objections?

Lysxia avatar May 12 '21 19:05 Lysxia

As long as there is no expectation for bytestring to follow suit, I don't object.

Bodigrim avatar May 12 '21 19:05 Bodigrim

I'd be happy to include the function. Even if we immediately throw a WARNING or DEPRECATION pragma and suggest something better, it's always good to have easily discoverable names that we can attach documentation to.

parsonsmatt avatar May 12 '21 22:05 parsonsmatt

@Lysxia if you are still in favor of the idea, feel free to merge.

Bodigrim avatar Feb 28 '23 18:02 Bodigrim