miette icon indicating copy to clipboard operation
miette copied to clipboard

Indentation with tabs throws off labels

Open surma opened this issue 4 years ago • 7 comments

(Sorry to not provide a proper repro. I’ll provide one later if you want, just wanted to get the issue opened before I forget).

If I have code indented with tabs, it seems that something goes wrong with the labelled spans:

function lol(): void {
	const x: i32 = 1;
};
Screenshot 2021-10-10 at 14 35 29

If I instead indent the code with spaces, it looks correct:

Screenshot 2021-10-10 at 14 35 35

I’m assuming that this might be due to terminal configuration how to display tabs?

surma avatar Oct 10 '21 13:10 surma

I'm not sure if there's much we can actually do about this, but I know that https://github.com/zkat/miette/issues/73 was filed before and a PR was sent in. And now we have a feature that lets you configure how wide tab characters should be? Does that solve your issue?

zkat avatar Oct 10 '21 21:10 zkat

Yeah #82 gives a (maybe unobvious) option to treat tabs as multiple spaces internally which I added exactly for the reason above.

@surma Try setting it up like this and it should render just fine.

miette::set_hook(Box::new(|_| {
    Box::new(miette::MietteHandlerOpts::new()
        .tab_width(4)
        .build())
}))

jlkiri avatar Oct 11 '21 14:10 jlkiri

Ooh that’s exactly the solution that I would have proposed: An option to convert tabs to spaces of a given width. Brilliant. Thank you.

surma avatar Oct 11 '21 15:10 surma

It's great to have this option, however I think it's still an issue that the default behavior is broken for source code indented with tabs. Relying on users of miette to be aware of the tab-alignment issue and enable tab_width themselves isn't ideal. For a bit of context, I'm currently running into this issue with nushell, which uses a near-default handler configuration.

The easiest option here would be to make tab_width = Some(4) the default. The main disadvantages of this is that it doesn't respect the tab width setting of the terminal, and that will render source code using tab characters that don't start on tabstop boundaries incorrectly.

For example, printing the following text to a terminal

col1\tc2\tcolumn4
c1\tcol2\ccol4

produces output that looks like this

col1	c2	column4
c1	col2	col4

while replacing tab characters with spaces looks like this

col1    c2    column4
c1    col2    col4

The alternate option, which doesn't have these downsides, would be to use actual tab characters to align the labels. Doing this correctly sounds somewhat tricky to get right.

I'd be up for writing a PR for either of these, and would probably prefer the latter option, but there's definitely a higher maintenance burden for that, and I don't want to implement it if the maintainers would rather go for the simpler option.

olivia-fl avatar May 25 '22 22:05 olivia-fl

Thinking about it... I think it might not actually be that tricky to use tab characters to align the start of those labels. You "just" need to scan the prevline (which iirc is already accessible where you would need it), count the number of tabs, and add that many tabs + the remaining characters in spaces, and everything should be in the right place? The trickier part, I think, is getting the underline itself to align properly.

I think I'd definitely rather have the second option here, imo, and I would love a PR for this. Happy to help if you have any questions about the code. And while you're in there, if you can come up with any brilliant ideas to address #97, even better (since this is actually related to that, tbh)

zkat avatar May 25 '22 22:05 zkat

I think it's a little bit more complicated than counting the number of tabs and sticking spaces on the end, since that will still produce different results if the tabs don't occur on tabstop boundaries.

For example:

12<tab>3

To align a label to the 3 character, the tab-counting approach gives <tab><space><space> to align the label, which would stick the label at the 10th column (on a terminal with 8-wide tabs). The 3 character actually occurs at the 8th column. I think the 100% correct approach for handling tabs is to replace every non-tab character with a space, and keep the tabs. For the above example, the label would be aligned with <space><space><tab>.

#97 looks like it already has a fix merged. Is it still an issue?

Also yeah, no good ideas about how to handle the underline yet...

I'll probably have time to work on this later today, but not sure how hard it will actually be to get right :).

olivia-fl avatar May 25 '22 23:05 olivia-fl

The fix to #97 wasn't complete. It was only for fixing the numbers in the "narratable" handler that just prints out a text explanation. It still needs to be done for the graphical handler.

zkat avatar May 25 '22 23:05 zkat