Wrap multiple strings into a single paragraph (i.e. keep state while wrapping)

Open swsnr opened this issue 5 years ago • 12 comments

Hello,

I'd like to wrap paragraph text from a Markdown document parsed with pulldown_cmark.

This library feeds me a stream of events that intersperses the text of a paragraph with formatting events, e.g.:

  1. Text(Lorem ipsum dolor sit )
  2. Softwrap
  3. Text(amet, )
  4. Start(Bold)
  5. Text(consetetur)
  6. End(Bold)
  7. Text(sadipscing elitr)

To wrap all Text into a single paragraph I'd currently need to wrap each Text individually into a sequence of lines, measure the length of the last line, and then use that length (plus an extra space) as initial_indent for the next Text event, and so on…

This is certainly possible but not exactly convenient and I can't profit from textwrap's work, which already measured the length of the text.
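
For illustration, the workaround described above can be sketched without textwrap at all. The following is a minimal, std-only greedy wrapper and not textwrap's API: `wrap_fragment` and `wrap_paragraph` are hypothetical names, width is counted in `char`s rather than display columns, and every fragment boundary is assumed to be a word boundary.

```rust
/// Greedily wraps `text` to `width` columns, assuming the current output line
/// already holds `start_col` columns. Returns the wrapped pieces (the first
/// piece continues the current line) and the column where the last piece ends.
/// Hypothetical helper, not part of textwrap.
fn wrap_fragment(text: &str, width: usize, start_col: usize) -> (Vec<String>, usize) {
    let mut pieces = Vec::new();
    let mut current = String::new();
    let mut col = start_col;
    for word in text.split_whitespace() {
        // Simplification: one column per `char`, ignoring display width.
        let len = word.chars().count();
        if col > 0 {
            if col + 1 + len > width {
                // The word does not fit: finish this piece, start a new line.
                pieces.push(std::mem::take(&mut current));
                col = 0;
            } else {
                // The word fits after a single separating space.
                current.push(' ');
                col += 1;
            }
        }
        current.push_str(word);
        col += len;
    }
    pieces.push(current);
    (pieces, col)
}

/// Wraps all fragments into one paragraph by carrying the end column of each
/// fragment over to the next one, as the workaround above describes.
fn wrap_paragraph(fragments: &[&str], width: usize) -> Vec<String> {
    let mut lines = vec![String::new()];
    let mut col = 0;
    for fragment in fragments {
        let (pieces, end_col) = wrap_fragment(fragment, width, col);
        let mut pieces = pieces.into_iter();
        // The first piece continues the line the previous fragment ended on.
        if let Some(first) = pieces.next() {
            lines.last_mut().expect("lines is never empty").push_str(&first);
        }
        lines.extend(pieces);
        col = end_col;
    }
    lines
}
```

Called with the four text fragments from the list above and a width of 20, `wrap_paragraph` returns the lines `Lorem ipsum dolor`, `sit amet, consetetur` and `sadipscing elitr`. The caller has to thread `col` through every call by hand, which is the inconvenience the request is about.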

Could textwrap provide a ParagraphWrapper which keeps state of wrapping?

swsnr • Nov 10 '20 21:11

Hi @lunaryorn,

I'm not 100% sure what you're trying to do, but are you perhaps translating the events into console text? With bold and underlined text via ANSI escape codes? Then this might be useful: https://github.com/mgeisler/textwrap/issues/140#issuecomment-418199515.

mgeisler • Nov 12 '20 21:11

@mgeisler No, in fact I'm trying to wrap just the text itself because I'm already handling events differently.

I've made an example in playpen: https://play.integer32.com/?version=stable&mode=debug&edition=2018&gist=dd4ab26f4bbed27a3b432a764bded213

Does this help? Essentially I'd like to have a mutable wrapper which keeps wrapping state across invocations of wrap_iter.

swsnr • Nov 15 '20 12:11

Ah, thank you for the example — that was super helpful!

I've rewritten the wrapping algorithm recently (#221) and I am wondering how the new system can be adapted to your use-case... right now the new top-level wrap function returns a Vec of strings. However, the underlying wrapping could be changed into something which is iterative instead of being single-shot.

If we imagine we have a State object with the current state, then one could imagine doing

while let Some(text) = some_source_of_text_fragments {
    state.add_text(&text);

    while let Some(line) = state.next_complete_line() {
        // do something with the line
    }
}

if let Some(last) = state.last_line() {
    // do something with the incomplete final line
}

So this object would be very much like an iterator, but it would only yield complete lines — and there would then be an extra method to call to get the final incomplete line. I wrote the above with while let instead of for just to make the mechanics clear. I think the object could implement Iterator just fine — it would be the opposite of a "fused" iterator, but I don't see any harm in that.
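
To make the proposed mechanics concrete, here is a minimal, std-only sketch of such an object. It is an assumption-laden illustration, not textwrap's implementation: the `State` type, its fields and the `wrap_fragments` helper are hypothetical, wrapping is plain greedy first-fit, width is counted in `char`s, and fragments are assumed to be separated by whitespace.

```rust
use std::collections::VecDeque;

/// Hypothetical stateful wrapper modelled on the `State` object sketched above.
/// Not part of textwrap.
struct State {
    width: usize,
    /// Lines that can no longer change.
    complete: VecDeque<String>,
    /// The line still being filled, and its width in `char`s.
    current: String,
    current_width: usize,
}

impl State {
    fn new(width: usize) -> Self {
        State {
            width,
            complete: VecDeque::new(),
            current: String::new(),
            current_width: 0,
        }
    }

    /// Feeds another piece of text into the paragraph.
    fn add_text(&mut self, text: &str) {
        for word in text.split_whitespace() {
            let len = word.chars().count();
            if self.current_width > 0 {
                if self.current_width + 1 + len > self.width {
                    // The word overflows, so the current line is now complete.
                    self.complete.push_back(std::mem::take(&mut self.current));
                    self.current_width = 0;
                } else {
                    self.current.push(' ');
                    self.current_width += 1;
                }
            }
            self.current.push_str(word);
            self.current_width += len;
        }
    }

    /// Returns the next line that is known to be finished, if any.
    fn next_complete_line(&mut self) -> Option<String> {
        self.complete.pop_front()
    }

    /// Takes the incomplete final line, if there is one.
    fn last_line(&mut self) -> Option<String> {
        self.current_width = 0;
        Some(std::mem::take(&mut self.current)).filter(|line| !line.is_empty())
    }
}

/// Not fused: `next` may return `None` and later `Some` after more text is added.
impl Iterator for State {
    type Item = String;

    fn next(&mut self) -> Option<String> {
        self.next_complete_line()
    }
}

/// Drives the state in the same way as the loop above.
fn wrap_fragments(fragments: &[&str], width: usize) -> Vec<String> {
    let mut state = State::new(width);
    let mut lines = Vec::new();
    for fragment in fragments {
        state.add_text(fragment);
        while let Some(line) = state.next_complete_line() {
            lines.push(line);
        }
    }
    if let Some(last) = state.last_line() {
        lines.push(last);
    }
    lines
}
```

In this sketch a line only counts as complete once a later word has overflowed it, which is why the final line needs the separate `last_line` call.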

Would such an API work for you? I don't know how to implement it yet, but it should be feasible ;-)

mgeisler • Nov 15 '20 21:11

That'd do perfectly 👍

swsnr • Nov 16 '20 10:11

One question, though: with such an API, you won't be able to tell where each piece of text ends in the wrapped lines.

That is, if you call add_text three times and get back two lines of wrapped text, then you don't immediately know where the second piece of text went, if you see what I mean?

Also, a piece of input text can end up spanning several wrapped lines. Conversely, each wrapped line can contain fragments from multiple pieces of input.

I think I'm saying that this gives you poor control of where your formatting goes — but perhaps that's okay for your use case?
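
The concern can be illustrated with a small std-only sketch that records which input fragment each part of a wrapped line came from. This is only one possible way to keep that information, under the same simplifying assumptions as a plain greedy wrapper (width counted in `char`s, fragments separated by whitespace); `Span` and `wrap_with_sources` are hypothetical names, not textwrap API.

```rust
/// A run of words on one wrapped line that all came from the same input
/// fragment. Adjacent spans on a line are separated by a single space.
#[derive(Debug, PartialEq)]
struct Span {
    /// Index of the input fragment the words came from.
    source: usize,
    text: String,
}

/// Greedy first-fit wrapping that keeps track of fragment provenance.
fn wrap_with_sources(fragments: &[&str], width: usize) -> Vec<Vec<Span>> {
    let mut lines: Vec<Vec<Span>> = vec![Vec::new()];
    let mut col = 0;
    for (source, fragment) in fragments.iter().enumerate() {
        for word in fragment.split_whitespace() {
            let len = word.chars().count();
            if col > 0 && col + 1 + len > width {
                // The word does not fit on the current line: start a new one.
                lines.push(Vec::new());
                col = 0;
            }
            // One column for the separating space unless the line is empty.
            let sep = usize::from(col > 0);
            col += sep + len;

            let line = lines.last_mut().expect("lines is never empty");
            let same_source = line.last().map_or(false, |span| span.source == source);
            if same_source {
                // Extend the span of the fragment we are already in.
                let span = line.last_mut().expect("checked above");
                span.text.push(' ');
                span.text.push_str(word);
            } else {
                line.push(Span { source, text: word.to_string() });
            }
        }
    }
    lines
}
```

For the fragments `"Lorem ipsum dolor sit "`, `"amet, "`, `"consetetur"` and `"sadipscing elitr"` at width 20, the first fragment is split across the first two lines, and the second line holds spans from fragments 0, 1 and 2. A line-only API would hide exactly this mapping.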

mgeisler • Nov 16 '20 11:11

I see what you mean but I don't think that'll be an issue for me. For my purposes the amount of control "initial_indent" and "subsequent_indent" give me should be sufficient.

swsnr • Nov 16 '20 17:11

> For my purposes the amount of control "initial_indent" and "subsequent_indent" give me should be sufficient.

How do you intend to use those parameters? I'm imagining that they're fixed from when you create the stateful object in the beginning (before feeding in the first piece of text).

mgeisler • Nov 17 '20 08:11

They'd be fixed before each paragraph. In other words if they need to be changed I'd start a new state (or reset the state).

swsnr • Nov 17 '20 11:11

Okay, it sounds like we're on the same page then with regard to an overall API. My main priority right now is to get the current rewrite released, so I'm not sure when I'll get to look at this issue. If someone wants to take a stab at it, then they're more than welcome.

mgeisler • Nov 21 '20 09:11

Coincidentally, I have combined textwrap and pulldown_cmark in a library called runwrap. It has some known limitations (see the readme) and does not handle subsequent indentation nicely in indented paragraphs.

veikman • Jun 06 '21 16:06

> Coincidentally, I have combined textwrap and pulldown_cmark in a library called runwrap. It has some known limitations (see the readme) and does not handle subsequent indentation nicely in indented paragraphs.

Hey @veikman,

Thanks a lot for mentioning this! I hope you don't mind, but I've added a link to your library on the Rust Users Forum where someone asked for just such a thing yesterday :smile:

I'd love to hear more about your experiences with unfill; I'll create a separate issue for this.

mgeisler • Jun 06 '21 20:06

Just an update on this from my side: I've started to work with the textwrap::core APIs, tracking state manually and then using the core APIs to wrap text. So far it seems to work quite well, so it looks as if this issue is solved for me.

swsnr • Sep 13 '21 18:09