hardlines at the end of nest blocks

Open Alasdair opened this issue 1 year ago • 5 comments

Hi,

I've been working on a formatter for a language with C-style line comments, and I've run into an issue with the behavior of hardline and nest that is quite tricky to work around. The problem is if you have:

nest n (x ^^ hardline) ^^ y

then the hardline immediately outputs spaces to the nest n level causing the indentation to 'leak' out of the nest block and indent the first line of y. Obviously in the simple case this can be worked around by writing nest n x ^^ hardline ^^ y instead, but the problem is if the hardline is generated by some other function like:

let line_comment contents = string "//" ^^ string contents ^^ hardline

and it occurs in a situation like nest n (x ^^ y) ^^ z, where y eventually ends in a line comment after potentially many calls to other formatting functions. Trying to avoid this would require re-writing my formatter in a rather contorted way.
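
To make the leak concrete, here is a minimal sketch using a toy document model rather than PPrint itself. The `doc` type and the `render` function are assumptions made for illustration; they only imitate the behavior described above, where indentation is written immediately after each newline.

```ocaml
(* Toy model for illustration only; this is NOT PPrint's engine. *)
type doc =
  | Str of string
  | Hardline
  | Cat of doc * doc
  | Nest of int * doc

let ( ^^ ) a b = Cat (a, b)

(* Eager semantics: the indentation is written right after each newline,
   using the nesting level in force where the Hardline sits. *)
let render (d : doc) : string =
  let buf = Buffer.create 64 in
  let rec go indent = function
    | Str s -> Buffer.add_string buf s
    | Hardline ->
        Buffer.add_char buf '\n';
        Buffer.add_string buf (String.make indent ' ')
    | Cat (a, b) -> go indent a; go indent b
    | Nest (n, d) -> go (indent + n) d
  in
  go 0 d;
  Buffer.contents buf

(* Same shape as the line_comment helper from the issue. *)
let line_comment contents = Str "//" ^^ Str contents ^^ Hardline

(* The hardline sits inside the nest, so its indentation lands before "y". *)
let leaky = Nest (2, Str "x" ^^ Hardline) ^^ Str "y"

(* The simple workaround: move the hardline out of the nest by hand. *)
let workaround = Nest (2, Str "x") ^^ Hardline ^^ Str "y"

(* The hard case: the hardline is hidden inside a helper's result. *)
let hidden = Nest (2, Str "a; " ^^ line_comment " note") ^^ Str "b"
```

In this model `render leaky` yields `"x\n  y"` while `render workaround` yields `"x\ny"`, and `hidden` shows the same leak even though no hardline appears at the call site.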

It would be more useful to me if there was a linebreak combinator that satisfies the property:

nest n (x ^^ linebreak) ^^ y == nest n x ^^ linebreak ^^ y

I was able to achieve this by modifying PPrintEngine like so: https://github.com/Alasdair/pprint/commit/bac74641ea5487090ef3e12708182bf89af3bb59 such that continue takes a pending hardline parameter and only prints the spaces when it encounters KCons. Modifying the behavior of hardline itself might be a breaking change for other users (and I am not sure if there are unintended consequences of that change), so perhaps a separate combinator would be better?

Best, Alasdair

Alasdair avatar Mar 16 '23 17:03 Alasdair

Hello Alasdair. Thanks for this suggestion. I definitely do not want to change the semantics of the existing hardline combinator, but adding a new combinator with a different semantics could make sense, provided I can convince myself that I understand exactly what the new semantics is supposed to be.

Currently the semantics of nest is that it adds indentation to every hardline combinator in its scope. (The indentation is printed immediately after the newline character.) I am not sure how you would describe the semantics of the new combinator? The equation that you write above suggests that it is not affected by nest, but I am guessing that it must still somehow be affected by nest, otherwise you would get zero indentation.

Thinking out loud, perhaps one could propose an unnest combinator which locally undoes the effect of the innermost nest combinator; then your linebreak might be sugar for unnest hardline. But I am not sure whether that would be very robust; some people might want to escape multiple nest combinators, and it may be difficult to keep track of exactly how many nest combinators one wishes to escape.
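
The unnest idea can be sketched on a toy document model. Everything here is hypothetical: `Unnest` is not an existing PPrint combinator, and the `doc` type and `render` function are stand-ins invented for illustration. The sketch keeps the active nest amounts on a stack so that `Unnest` can drop the innermost one.

```ocaml
(* Toy model for illustration only; Unnest is a hypothetical combinator. *)
type doc =
  | Str of string
  | Hardline
  | Cat of doc * doc
  | Nest of int * doc
  | Unnest of doc

let ( ^^ ) a b = Cat (a, b)

let render (d : doc) : string =
  let buf = Buffer.create 64 in
  (* stack holds the amounts of the enclosing Nest nodes, innermost first. *)
  let rec go stack = function
    | Str s -> Buffer.add_string buf s
    | Hardline ->
        Buffer.add_char buf '\n';
        Buffer.add_string buf (String.make (List.fold_left ( + ) 0 stack) ' ')
    | Cat (a, b) -> go stack a; go stack b
    | Nest (n, d) -> go (n :: stack) d
    | Unnest d ->
        (* Locally undo the innermost Nest, if there is one. *)
        go (match stack with [] -> [] | _ :: rest -> rest) d
  in
  go [] d;
  Buffer.contents buf

(* linebreak as sugar for unnest hardline. *)
let linebreak = Unnest Hardline

let one_level = Nest (2, Str "x" ^^ linebreak) ^^ Str "y"
let two_levels = Nest (2, Nest (2, Str "x" ^^ linebreak)) ^^ Str "y"
```

In this model `render one_level` gives `"x\ny"`, but `render two_levels` gives `"x\n  y"`: a single unnest escapes only one of the two nest levels, which is the robustness concern raised above.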

fpottier avatar Mar 16 '23 20:03 fpottier

I have solved this issue by changing the semantics of nest. I changed it from

every time a newline character is emitted, it is immediately followed by n blank characters, where n is the current indentation level.

to

every time a character is emitted at the beginning of a line, it is immediately preceded by n blank characters, where n is the current indentation level.

This solves the issue elegantly.
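
A minimal sketch of this deferred-indentation rule, again on a toy document model rather than on PPrint's engine (the `doc` type and `render_lazy` are assumptions for illustration, not the code from the linked commit). Indentation is not written after the newline; it is written just before the first character of a line, using the nesting level in force at that character. As a further assumption, the start of the document counts as the start of a line.

```ocaml
(* Toy model for illustration only; this is NOT PPrint's engine. *)
type doc =
  | Str of string
  | Hardline
  | Cat of doc * doc
  | Nest of int * doc

let ( ^^ ) a b = Cat (a, b)

(* Deferred semantics: a newline only records that we are at a line start;
   the indentation is emitted when the next character is about to be written.
   Strings are assumed not to contain newline characters. *)
let render_lazy (d : doc) : string =
  let buf = Buffer.create 64 in
  let at_line_start = ref true in
  let rec go indent = function
    | Str "" -> ()  (* no character emitted, so no indentation either *)
    | Str s ->
        if !at_line_start then begin
          Buffer.add_string buf (String.make indent ' ');
          at_line_start := false
        end;
        Buffer.add_string buf s
    | Hardline ->
        Buffer.add_char buf '\n';
        at_line_start := true
    | Cat (a, b) -> go indent a; go indent b
    | Nest (n, d) -> go (indent + n) d
  in
  go 0 d;
  Buffer.contents buf

let x = Str "x"
let y = Str "y"

let inside = Nest (2, x ^^ Hardline) ^^ y
let outside = Nest (2, x) ^^ Hardline ^^ y
```

Under this rule `render_lazy inside` and `render_lazy outside` produce the same string, because `y` is indented according to its own nesting level rather than that of the preceding hardline. A side effect is that empty lines carry no trailing spaces.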

hackwaly avatar Jun 04 '23 22:06 hackwaly

Oh, I rather got distracted by other things and forgot a bit about this issue - apologies.

I came up with my own solution where I used the range combinator to track the output location of each hardline, then I used a postprocessing step to 'fix' the line breaks in a separate pass. It feels like a bit of a hack, but this let me implement a few more useful newline combinators like a newline that disappears when followed by another newline (which I use after line comments, to ensure they are followed by a line break, but never introduce more than strictly required).
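
The collapsing rule for such a newline can be sketched on a flat list of output items. This is only an assumed illustration of the behavior described above: it does not use PPrint's range combinator or track output locations, and the `item` type and `render` function are hypothetical names.

```ocaml
(* Illustration only: a flat token stream instead of PPrint documents. *)
type item =
  | Text of string
  | Newline
  | Collapsing_newline  (* disappears when another newline follows it *)

let render (items : item list) : string =
  let buf = Buffer.create 64 in
  let rec go = function
    | [] -> ()
    (* A collapsing newline directly followed by any newline is dropped. *)
    | Collapsing_newline :: ((Newline | Collapsing_newline) :: _ as rest) ->
        go rest
    | (Newline | Collapsing_newline) :: rest ->
        Buffer.add_char buf '\n';
        go rest
    | Text s :: rest ->
        Buffer.add_string buf s;
        go rest
  in
  go items;
  Buffer.contents buf

(* A line comment always ends its line, but adds no extra blank line. *)
let followed_by_newline = [ Text "// c"; Collapsing_newline; Newline; Text "x" ]
let followed_by_text = [ Text "// c"; Collapsing_newline; Text "x" ]
```

Both lists render to `"// c\nx"` in this sketch: the comment is always followed by exactly one line break, whether or not the caller added its own.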

Alasdair avatar Jun 04 '23 23:06 Alasdair

@hackwaly, this sounds like an interesting suggestion.

I suppose you mean: "every time a nonblank character is emitted at the beginning of a line, it is immediately preceded by n blank characters, where n is the current indentation level".

I would be interested in seeing your code, if possible.

This is a global change in the semantics, so, in order not to break existing code, it would have to be a new mode, which users would have to explicitly request.

fpottier avatar Jun 05 '23 08:06 fpottier

@fpottier https://github.com/hackwaly/pprint/commit/ca777bf60c05b4c94f8a42863b23e4e466b27cc7#diff-a126b196d27ba148c5b7e41ae38d5771e04cdbf7972ed5b6abd0ac2cd8851a55R574

hackwaly avatar Jun 08 '23 03:06 hackwaly