
Wrap text to column limit.

Open swsnr opened this issue 7 years ago • 12 comments

Use textwrap, perhaps, to wrap all content to the column size of the TTY.

Will perhaps be tricky with termion formatting characters.

swsnr avatar Jan 07 '18 12:01 swsnr

We can't directly wrap text (e.g., with textwrap), because we need to account for formatting escapes and for indentation.

But we can use unicode_width to compute the width of text as we write it, and keep track of the current column being written to. If writing would exceed the column limit, we can scan backwards for the first whitespace before the column limit, wrap the text there, and try again until the entire text is written.
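A minimal sketch of that column-tracking idea, assuming the text arrives as one string. It goes word by word rather than scanning backwards, which yields the same greedy breaks. `display_width` and `wrap_to_columns` are hypothetical names, and `display_width` is only a stand-in that counts one column per `char`; real code would use unicode_width there. Escape sequences and indentation are not handled.

```rust
/// Stand-in for a real width function such as the one in `unicode_width`.
/// Assumption: every `char` is one column wide, which is wrong for wide
/// (e.g. CJK) and zero-width characters.
fn display_width(s: &str) -> usize {
    s.chars().count()
}

/// Greedily wrap `text` so that no line exceeds `limit` columns,
/// unless a single word is itself longer than `limit`.
fn wrap_to_columns(text: &str, limit: usize) -> Vec<String> {
    let mut lines = Vec::new();
    let mut line = String::new();
    let mut col = 0; // column reached on the line currently being written

    for word in text.split_whitespace() {
        let width = display_width(word);
        // The word (plus a separating space) would exceed the limit:
        // break at the whitespace before it.
        if col > 0 && col + 1 + width > limit {
            lines.push(std::mem::take(&mut line));
            col = 0;
        }
        if col > 0 {
            line.push(' ');
            col += 1;
        }
        line.push_str(word);
        col += width;
    }
    if !line.is_empty() {
        lines.push(line);
    }
    lines
}
```

Calling `wrap_to_columns("the quick brown fox", 10)` gives two lines, "the quick" and "brown fox".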

swsnr avatar Aug 26 '18 12:08 swsnr

That would be great to have. E.g. in cases where links get transferred to the bottom of the document the shorter lines are very disturbing.

Just give me 3 or 4 months to learn a little Rust (I really want that feature ;) )

agschaid avatar Sep 15 '20 16:09 agschaid

consolemd does indeed use the Python equivalent of textwrap: https://github.com/kneufeld/consolemd/commit/6a2c6eecb9ab0589f0a7c8e3db55192e6cc06ae1 which means it'll occasionally go under the provided wrap limit, but, eh, it's good enough as a start

igalic avatar Sep 15 '20 18:09 igalic

@agschaid Take all the time you need; I don't think I'll fix this anytime soon.

@igalic I don't think that's good enough for me. It ought to be done right.

swsnr avatar Sep 15 '20 20:09 swsnr

i understand, @lunaryorn!

i suppose the rust way would be to create a wrapper type that can represent the output on the console, but can also be used as a String or &str for feeding into textwrap

igalic avatar Sep 15 '20 21:09 igalic

I believe there is no need to create a wrapper type, as textwrap has supported ANSI escape codes for a few months now (https://github.com/mgeisler/textwrap/pull/179). I don't know how well that works on Windows though.

orasunis avatar Sep 18 '20 13:09 orasunis

@orasunis I guess that'd be a good start. I haven't really looked at it, but if it handles colours, OSC 8 hyperlinks and iterm marks well, it might even be a perfect solution :slightly_smiling_face:

swsnr avatar Sep 18 '20 16:09 swsnr

just to put my "commitment" into perspective: I have two kids. So my "3 or 4 months" can quickly grow into "2 or 3 years" ;)

I think this is a good project/motivation for me to get a little into rust. But don't count on me.

agschaid avatar Sep 21 '20 09:09 agschaid

@agschaid There's no commitment here. We all do this in our free time, we all have a life and priorities :slightly_smiling_face:

swsnr avatar Sep 21 '20 09:09 swsnr

Hi all, I saw a link to this on https://github.com/mgeisler/textwrap/pull/179 -- just wanted to say that textwrap will indeed ignore all ANSI escape sequences since version 0.12. Ignoring means "they don't contribute to the string width", so the wrapping computations are not affected by the escape characters any longer.
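To illustrate what "they don't contribute to the string width" means, here is a minimal sketch. It is not textwrap's actual implementation: `visible_width` is a hypothetical helper that only recognises CSI sequences (`ESC [ ... final byte`), assumes one column per `char`, and does not handle OSC 8 hyperlinks.

```rust
/// Count the columns `s` occupies on screen, skipping ANSI CSI sequences
/// such as the SGR color codes `ESC [ 31 m`.
/// Assumptions: one column per `char`; an ESC that does not start a CSI
/// sequence is dropped together with the character following it.
fn visible_width(s: &str) -> usize {
    let mut width = 0;
    let mut chars = s.chars();
    while let Some(c) = chars.next() {
        if c == '\x1b' {
            if chars.next() == Some('[') {
                // Parameter and intermediate bytes run until a final byte
                // in the range 0x40..=0x7E, which ends the sequence.
                for c in chars.by_ref() {
                    if ('\x40'..='\x7e').contains(&c) {
                        break;
                    }
                }
            }
        } else {
            width += 1;
        }
    }
    width
}
```

With this, `visible_width("\x1b[31mred\x1b[0m text")` is 8, the same as for the uncolored string "red text".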

> Will perhaps be tricky with termion formatting characters.

I've actually been playing around with this recently and I wrote a little demo program: https://github.com/mgeisler/textwrap/blob/master/examples/interactive.rs

It uses termion and if you modify it to use colored text, then you'll see that you can indeed very easily run into problems. Basically, textwrap::wrap will give you back a Vec of strings, complete with the original escape codes. If you just print those to the terminal everything works fine. However, my example program draws a red border around the text and so I use code like

        write!(
            stdout,
            "{}{}│{}",
            cursor::Goto(col - 1, row),
            color::Fg(color::Red),
            color::Fg(color::Reset),
        )?;

This is a problem if there is, say, blue text which was supposed to be wrapped over two lines: the color now stops at the point of the color::Fg(color::Reset) code.
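One possible workaround, sketched under assumptions rather than taken from the demo program: after wrapping, re-emit the color that was still active at the end of the previous line at the start of the next one, and close every line with a foreground reset so a border drawn afterwards starts from a clean state. `carry_color` is a hypothetical helper; it only tracks the most recent SGR sequence, and it treats `ESC[39m` and `ESC[0m` as resets.

```rust
/// Foreground reset. Assumption: this is what the terminal library emits
/// when asked to reset the foreground color.
const FG_RESET: &str = "\x1b[39m";

/// Make each wrapped line self-contained with respect to color: prefix it
/// with the SGR sequence still active from the previous line and append a
/// reset if a color is still active at its end.
fn carry_color(lines: &[&str]) -> Vec<String> {
    let mut active: Option<String> = None; // last SGR sequence still in effect
    let mut out = Vec::with_capacity(lines.len());

    for line in lines {
        let mut fixed = active.clone().unwrap_or_default();
        fixed.push_str(line);

        // Scan the line for CSI sequences and remember the last SGR one.
        let mut rest = *line;
        while let Some(start) = rest.find("\x1b[") {
            let body = &rest[start + 2..];
            // A CSI sequence ends at the first byte in 0x40..=0x7E.
            let Some(end) = body.find(|c: char| ('\x40'..='\x7e').contains(&c)) else {
                break;
            };
            if body[end..].starts_with('m') {
                let seq = &rest[start..start + 2 + end + 1];
                active = if seq == FG_RESET || seq == "\x1b[0m" {
                    None
                } else {
                    Some(seq.to_string())
                };
            }
            rest = &body[end + 1..];
        }

        if active.is_some() {
            fixed.push_str(FG_RESET);
        }
        out.push(fixed);
    }
    out
}
```

For blue text wrapped over two lines, the first line then ends with a reset and the second line starts with the blue sequence again, so a reset written while drawing the border between them no longer cuts the color short.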

Perhaps you don't have such borders and then things are easy...?

mgeisler avatar Dec 03 '20 11:12 mgeisler

ANSI sequences aren't the issue; the issue is more that pulldown is a pull parser, so we don't get the text at once but rather scattered over many different events.

swsnr avatar Dec 03 '20 13:12 swsnr

Hi @lunaryorn,

I just realized that we talked about stateful wrapping over in https://github.com/mgeisler/textwrap/issues/224 :-)

So yeah, if you get your text one piece at a time, then it'll be harder to use textwrap. I "cheated" and simply accumulated the text in https://github.com/mgeisler/textwrap/issues/140#issuecomment-418199515, but I understand that you have a different architecture in mdcat.
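The accumulate-then-wrap approach can be sketched roughly as follows. The `Event` enum here is a simplified, hypothetical stand-in for what a pull parser yields, not pulldown's real type, and `greedy_wrap` is a plain word-based wrapper that assumes one column per `char`.

```rust
/// Simplified stand-in for pull-parser events (hypothetical, for illustration).
enum Event<'a> {
    StartParagraph,
    Text(&'a str),
    SoftBreak,
    EndParagraph,
}

/// Greedy word wrapping; assumes one column per `char`.
fn greedy_wrap(text: &str, limit: usize) -> Vec<String> {
    let mut lines: Vec<String> = Vec::new();
    for word in text.split_whitespace() {
        let width = word.chars().count();
        match lines.last_mut() {
            // The word still fits on the current line.
            Some(line) if line.chars().count() + 1 + width <= limit => {
                line.push(' ');
                line.push_str(word);
            }
            // First word, or the word does not fit: start a new line.
            _ => lines.push(word.to_string()),
        }
    }
    lines
}

/// Buffer the text of each paragraph as events arrive and wrap it only
/// once the paragraph ends.
fn render_paragraphs<'a>(
    events: impl IntoIterator<Item = Event<'a>>,
    limit: usize,
) -> Vec<String> {
    let mut buffer = String::new();
    let mut lines = Vec::new();
    for event in events {
        match event {
            Event::StartParagraph => buffer.clear(),
            Event::Text(text) => buffer.push_str(text),
            Event::SoftBreak => buffer.push(' '),
            Event::EndParagraph => {
                lines.extend(greedy_wrap(&buffer, limit));
                buffer.clear();
            }
        }
    }
    lines
}
```

The cost of this design is that nothing is written until the paragraph is complete, which is exactly what a renderer that writes each event as it arrives cannot do without restructuring.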

With https://github.com/mgeisler/textwrap/pull/234, I'm introducing a new, more advanced wrapping algorithm which selects optimal break points for an entire paragraph at a time ("optimal" according to some penalties which discourage short lines). This is by its nature also quite stateful. I will see if I can make the original wrapping algorithm work in an incremental fashion again.

mgeisler avatar Dec 03 '20 13:12 mgeisler