typstfmt icon indicating copy to clipboard operation
typstfmt copied to clipboard

[bug] Text wrapping: broken list case; 2x indent; length overflow

Open Andrew15-5 opened this issue 2 years ago • 48 comments

This is a separate issue, related to the #74, since the text wrapping itself kinda works. Now it's time for bugs!

Before:
We have next things:
- thing 1;
- thing 2;
- thing 3.

#f[Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magnam aliquam quaerat voluptatem. Ut enim aeque doleamus animo, cum corpore dolemus, fieri.]

With default:

indent_space = 2
max_line_length = 50
experimental_args_breaking_consecutive = false

And version: 0.2.0#578fabc

After 1st run:
We have next things: - thing 1;
- thing 2;
- thing 3.

#f[Lorem ipsum dolor sit amet, consectetur adipiscing
  elit, sed do eiusmod tempor incididunt ut labore
  et dolore magnam aliquam quaerat voluptatem. Ut
  enim aeque doleamus animo, cum corpore dolemus,
  fieri.]
After 2nd run:
We have next things: - thing 1; - thing 2;
- thing 3.

#f[Lorem ipsum dolor sit amet, consectetur adipiscing
    elit, sed do eiusmod tempor incididunt ut labore
    et dolore magnam aliquam quaerat voluptatem. Ut
    enim aeque doleamus animo, cum corpore dolemus,
    fieri.]
After 3rd run:
We have next things: - thing 1; - thing 2; - thing 3.

#f[Lorem ipsum dolor sit amet, consectetur adipiscing
    elit, sed do eiusmod tempor incididunt ut labore
    et dolore magnam aliquam quaerat voluptatem. Ut
    enim aeque doleamus animo, cum corpore dolemus,
    fieri.]
After 4th run:
We have next things: - thing 1; - thing 2; - thing
3.

#f[Lorem ipsum dolor sit amet, consectetur adipiscing
    elit, sed do eiusmod tempor incididunt ut labore
    et dolore magnam aliquam quaerat voluptatem. Ut
    enim aeque doleamus animo, cum corpore dolemus,
    fieri.]

Next run will be as in 3rd, after as in 4th and so on.

But with max_line_length = 80 it stops formatting after 3rd run.

Note that the text lines exceed the 50 mark, which is also wrong (same with 80).

What's wrong:

  • [x] merge of list lines with non-list lines if they are on adjacent lines (it could be //-/+/\d+\.)
    • [x] in 0.2.1#1817538 version -/+/\d+\. are fixed
    • [x] in #83 / (terms) also got fixed
  • [x] it takes 2 runs to add 2 indents to the content block with a text (which is a function argument) — fixed at least in 0.2.1#1817538
  • [ ] it adds 2 indents to the content block with a text (which is a function argument)
  • [ ] some lines of text can exceed the limit from the config (file) — discussed in #86

How it supposed to look like:

We have next things:
- thing 1;
- thing 2;
- thing 3.

#f[Lorem ipsum dolor sit amet, consectetur
  adipiscing elit, sed do eiusmod tempor
  incididunt ut labore et dolore magnam aliquam
  quaerat voluptatem. Ut enim aeque doleamus
  animo, cum corpore dolemus, fieri.]

But since the function name could be very long, it should format like this (either only for long function names or for all cases):

We have next things:
- thing 1;
- thing 2;
- thing 3.

#very-long-long-long-long-long-function-name(
  [Lorem ipsum dolor sit amet, consectetur
  adipiscing elit, sed do eiusmod tempor
  incididunt ut labore et dolore magnam aliquam
  quaerat voluptatem. Ut enim aeque doleamus
  animo, cum corpore dolemus, fieri.]
)

This last one can be debatable, I assume, but the main problem, is that Typst right now can't do something like this (shell):

var="\
first line
secondline"

Typst instead of one additional space will add a new paragraph if \ is used right after [ (same with ]). So, to line up all the content block lines, we can use otherwise redundant () to wrap the content block.

Note that if the function name is very long, it could exceed the max line limit, but we can't do anything about it (we can't cut it in half).

Andrew15-5 avatar Aug 31 '23 12:08 Andrew15-5

Thanks again for your helpful issues and uncovering the cursed fruit of my work !

#f[Lorem ipsum dolor sit amet, consectetur adipiscing
  elit, sed do eiusmod tempor incididunt ut labore
  et dolore magnam aliquam quaerat voluptatem. Ut
  enim aeque doleamus animo, cum corpore dolemus,
  fieri.]

is stable and pretty good looking, i don't see a problem with it. The list && enum thing is bad tho

astrale-sharp avatar Aug 31 '23 13:08 astrale-sharp

I think you've missed some of my points, specifically:

  • you show 1 indent, which is generally ok, but in my case it adds 2 indents after 2 runs! Formatter always has to format everything on the 1st run.
  • I noted that some function names could be very long, and you have to keep [ and the content inside together, as to not introduce unwanted space. Therefore, if the function name is long enough, you just have to add () around [] to move content block on a new line (examples are in the first comment).

These are the main problems right now with version 0.2.0#578fabc when it comes to content block formatting.

Andrew15-5 avatar Aug 31 '23 14:08 Andrew15-5

I indeed missed that first point! I might still be missing the second one tbh

astrale-sharp avatar Aug 31 '23 14:08 astrale-sharp

I can't reproduce #f[Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magnam aliquam quaerat voluptatem. Ut enim aeque doleamus animo, cum corpore dolemus, fieri.] , are you sure it's still a thing on master?

astrale-sharp avatar Aug 31 '23 14:08 astrale-sharp

I've left tested version, which is indeed an old version right now. I'll update and test again.

Andrew15-5 avatar Aug 31 '23 15:08 Andrew15-5

I indeed missed that first point! I might still be missing the second one tbh

This is default behavior on 0.2.1#1817538:

#very-long-long-long-long-long-function-name[Lorem ipsum dolor sit amet, consectetur adipiscing
  elit, sed do eiusmod tempor incididunt ut labore
  et dolore magnam aliquam quaerat voluptatem. Ut
  enim aeque doleamus animo, cum corpore dolemus,
  fieri.]

As I already said, it should be something like this:

#very-long-long-long-long-long-function-name(
  [Lorem ipsum dolor sit amet, consectetur
  adipiscing elit, sed do eiusmod tempor
  incididunt ut labore et dolore magnam aliquam
  quaerat voluptatem. Ut enim aeque doleamus
  animo, cum corpore dolemus, fieri.]
)

Because next formatting style will (could) introduce a space at the beginning:

#very-long-long-long-long-long-function-name[// Here an (un)wanted whitespace will appear.
  Lorem ipsum dolor sit amet, consectetur // Even if this line is not indented.
  adipiscing elit, sed do eiusmod tempor
  incididunt ut labore et dolore magnam aliquam
  quaerat voluptatem. Ut enim aeque doleamus
  animo, cum corpore dolemus, fieri.] // (1)
] // (2)

I prefer ] to be like (2), because it looks more like a block/function call, rather than (1).

Andrew15-5 avatar Aug 31 '23 15:08 Andrew15-5

I finally get it, you would want me to modify the ast in that manner! I don't know How I feel about that I would want this to happen

                                                                                     ﹀ not breaking here cause that modifies the content which is forbidden
#very-long-long-long-long-long-function-name[Lorem // breaking here cause it's over max length
  ipsum dolor sit amet, consectetur adipiscing
  elit, sed do eiusmod tempor incididunt ut labore
  et dolore magnam aliquam quaerat voluptatem. Ut
  enim aeque doleamus animo, cum corpore dolemus,
  fieri.]

astrale-sharp avatar Aug 31 '23 15:08 astrale-sharp

But that's not possible cause the child is not aware of the parent, what you propose is sensible, I'll think about it

astrale-sharp avatar Aug 31 '23 15:08 astrale-sharp

I fixed the other one btw

astrale-sharp avatar Aug 31 '23 15:08 astrale-sharp

I fixed the other one btw

I've just written a big-ish comment, and you've replied to my previous comments (probably). So you have to be more specific about "fixed the other one". I don't know what you are referring to.

Update: I've checked new commits. You were talking about list issue. I'll check it out.

Andrew15-5 avatar Aug 31 '23 15:08 Andrew15-5

Sorry XD

astrale-sharp avatar Aug 31 '23 15:08 astrale-sharp

All lists except terms are now working (in simple cases, I haven't tried anything complex).

Andrew15-5 avatar Aug 31 '23 15:08 Andrew15-5

I also fixed

    #very-long-long-long-long-long-function-name(
  [Lorem ipsum dolor sit amet, consectetur
  adipiscing elit, sed do eiusmod tempor]
)

becoming

    #very-long-long-long-long-long-function-name([Lorem ipsum dolor sit amet, consectetur
  adipiscing elit, sed do eiusmod tempor])

astrale-sharp avatar Aug 31 '23 15:08 astrale-sharp

Let's look at terms now

astrale-sharp avatar Aug 31 '23 15:08 astrale-sharp

Your latest #56fbba6 broke your "1 indent" fix. Now it again adds 2 indents but in one run! This is progress with one step back.

Andrew15-5 avatar Aug 31 '23 15:08 Andrew15-5

I totally forgot about terms, right now they're a bit broken If I try to apply the same rules as enums and list for some unknown reason:tm:

Your latest #56fbba6 broke your "1 indent" fix. Now it again adds 2 indents but in one run! This is progress with one step back.

I don't know what you're talking about

astrale-sharp avatar Aug 31 '23 15:08 astrale-sharp

I don't know what you're talking about

Last commit again introduced the initial issue:

#f[Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magnam aliquam quaerat voluptatem. Ut enim aeque doleamus animo, cum corpore dolemus, fieri.]
#f[Lorem ipsum dolor sit amet, consectetur adipiscing
    elit, sed do eiusmod tempor incididunt ut labore
    et dolore magnam aliquam quaerat voluptatem. Ut
    enim aeque doleamus animo, cum corpore dolemus,
    fieri.]

2 indents (in 1 run) instead of 1.

Andrew15-5 avatar Aug 31 '23 16:08 Andrew15-5

I think there's one level of indent for the func call and one for the content block, wouldn't you say it makes sense? if we wrote

#f([ .. ])

it might format as

#f(
  [
     .. 
  ]
)

would you say this deserves a special case and it should only be one level of indent?

astrale-sharp avatar Aug 31 '23 16:08 astrale-sharp

I thought TermItems were broken but I discovered a bug! (I was stuck on it for a while), I'll open another issue and ping you

astrale-sharp avatar Aug 31 '23 16:08 astrale-sharp

I think there's one level of indent for the func call and one for the content block, wouldn't you say it makes sense? if we wrote

#f([ .. ])
it might format as

#f(
  [
     .. 
  ]
)

Yes, that would make sense, if [] wasn't \n/ sensitive. There are a lot of cases when you have to write content inside content block like this: [any content], instead of [ any content ] or [\n any content\n], because they can produce different output! So, you can't add another indent inside/for a content block, at least in Typst v0.7.

would you say this deserves a special case and it should only be one level of indent?

Since [] is sensitive to newlines and whitespaces it becomes a non-special case (like normal arguments with values, or just values). Therefore, it should be indented as any normal argument: 1 indent.

If [] were to be a {} then it's a completely different thing. Right now, you can't write #f{sym.RR}, you have to wrap {} inside () like this: #f({sym.RR}). But this already has a good formatting:

// If was written in one line
#f({ sym.RR })

// If was written differently
#f({
  sym.RR
})

Andrew15-5 avatar Aug 31 '23 17:08 Andrew15-5

Content blocs are space sensitive but not indent sensitive :sunglasses: [a] to [ a ] isn't okay but [ a ] to [ a ] is as well as the current example ;) You may add as many indents as you wish but no Parbreak (more than two \n\n)

astrale-sharp avatar Aug 31 '23 17:08 astrale-sharp

[ a ] to [ a ] is

It's literally the same thing, nothing is changed. I don't understand what this can prove. Also, please use inline code formatting in Markdown, it improves readability of your examples.

Andrew15-5 avatar Aug 31 '23 17:08 Andrew15-5

it shows as is but the left one has 4 spaces total and the right one has 2 spaces total, I don't know why it doesn't show on github EDIT surrounded by ` it shows

astrale-sharp avatar Aug 31 '23 17:08 astrale-sharp

Because Markdown is the same as LaTeX/Typst, it ignores multiple whitespaces no matter how many there are of them.

Andrew15-5 avatar Aug 31 '23 17:08 Andrew15-5

My point exactly then, are you convinced? knowing this which do you think is the prettiest, one or two indents in this case?

astrale-sharp avatar Aug 31 '23 17:08 astrale-sharp

Content blocs are space sensitive but not indent sensitive

It's not indent sensitive because you already introduced the first whitespace by inserting a newline character:

a#f[some content]b

a#f[
some content]b

These 2 already will produce different output, but after formatting:

a #f[some content]b

a #f[
  some content
]b

They will output the same output, but not the one that is desired.

In the first case, it adds a new whitespace (no one asked), which is very bad! We need to make a new issue for this one.

In the second case, it, again, adds a new whitespace before the function call, then a \n and 2 more whitespaces after that for a total of 4 chars. This combined also results in 1 whitespace.

This is a pretty valid use case and the formatter alters the output, which should be fixed right away.

Andrew15-5 avatar Aug 31 '23 17:08 Andrew15-5

the program doesn't break the line in the cases you mentioned, it only modifies the space if it's there #[lore eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeem] becomes

#[lore
  eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeem]

astrale-sharp avatar Aug 31 '23 17:08 astrale-sharp

the program doesn't break the line in the cases you mentioned

"Doesn't break the line"? Which line are you talking about? The whole https://github.com/astrale-sharp/typstfmt/issues/82#issuecomment-1701477406 was about introduction of an additional whitespace if there is some text next to a function call. Which supposed to counter your point in https://github.com/astrale-sharp/typstfmt/issues/82#issuecomment-1701467237. Look at my examples in https://github.com/astrale-sharp/typstfmt/issues/82#issuecomment-1701477406 more carefully and check the output in PDF.

Andrew15-5 avatar Aug 31 '23 17:08 Andrew15-5

To me

a#f[some content]b

becomes

a #f[some content]b

I didn't understand you were talking about that space, i thought you were talking about a linebreak elsewhere, indeed that's a problem yea, although not hard to fix

astrale-sharp avatar Aug 31 '23 17:08 astrale-sharp

Now, that you understood what I was talking about, think again about newline after [ and before ]:

#f([ .. ])

it might format as

#f(
  [
     .. 
  ]
)

These 2 will produce different output:

a#f(
  [..]
)b

a#f(
  [
     .. 
  ]
)b

And this is exactly why content block should be treated as any other simple argument with a value — 1 indent.

Andrew15-5 avatar Aug 31 '23 18:08 Andrew15-5