jfx icon indicating copy to clipboard operation
jfx copied to clipboard

JDK-8314215 Trailing Spaces before Line Breaks Affect the Center Alignment of Text

Open hjohn opened this issue 2 years ago • 83 comments
trafficstars

There are a number of tickets open related to text rendering:

https://bugs.openjdk.org/browse/JDK-8314215

https://bugs.openjdk.org/browse/JDK-8145496

https://bugs.openjdk.org/browse/JDK-8129014

They have in common that wrapped text is taking the trailing spaces on each wrapped line into account when calculating where to wrap. This looks okay for text that is left aligned (as the spaces will be trailing the lines and generally aren't a problem, but looks weird with CENTER and RIGHT alignments. Even with LEFT alignment there are artifacts of this behavior, where a line like AAA BBB CCC (note the double spaces) gets split up into AAA , BBB and CCC, but if space reduces further, it will wrap too early because the space is taken into account (ie. AAA may still have fit just fine, but AAA doesn't, so the engine wraps it to AA + A or something).

The fix for this is two fold; first the individual lines of text should not include any trailing spaces into their widths; second, the code that is taking the trailing space into account when wrapping should ignore all trailing spaces (currently it is ignoring all but one trailing space). With these two fixes, the layout in LEFT/CENTER/RIGHT alignments all look great, and there is no more early wrapping due to a space being taking into account while the actual text still would have fit (this is annoying in tight layouts, where a line can be wrapped early even though it looks like it would have fit).

If it were that simple, we'd be done, but there may be another issue here that needs solving: wrapped aligned TextArea's.

TextArea don't directly support text alignment (via a setTextAlignment method like Label) but you can change it via CSS.

For Left alignment + wrapping, TextArea will ignore any spaces typed before a line that was wrapped. In other words, you can type spaces as much as you want, and they won't show up and the cursor won't move. The spaces are all getting appended to the previous line. When you cursor through these spaces, the cursor can be rendered out of the control's bounds. To illustrate, if you have the text AAA BBB CCC, and the text gets wrapped to AAA, BBB, CCC, typing spaces before BBB will not show up. If you cursor back, the cursor may be outside the control bounds because so many spaces are trailing AAA.

The above behavior has NOT changed, is pretty standard for wrapped text controls, and IMHO does not need further attention.

Now, for RIGHT alignment, the new code does change things a bit. Where before the single trailing space that was not collapsed was shown at the end of each wrapped line, this space is now gone. Typing further spaces after AAA will not show up, but they are still there, and moving the cursor through them will move the cursor beyond the right side of the control's bounds (just like LEFT alignment has such a case). Some Text rendering implementations will "clamp" the cursor to prevent this (browsers), while others will render the cursor in the margin (but eventually the cursor disappears if there is not enough margin and there are too many spaces).

Personally, I think this is absolutely fine. Wrapped text controls all have this kind of behavior, where typing a space just before the wrapping point seemingly has no effect until a non-space character is typed.

For CENTER alignment, the case is similar to LEFT alignment, except there is a bit less space available to render "invisible" spaces.

Screenshots

Using Labels containing the text AAA BBB CCC (note the doubled space between words)

First wrapping point

Before

image Note the odd trailing space in Center and Right alignments.

After

image

Small but text would still fit

Before

image Note how the wrapping occurs earlier than needed, leading to "empty" lines that contained invisible spaces

After

image Note how everything still fits


Progress

  • [x] Change must not contain extraneous whitespace
  • [x] Commit message must refer to an issue
  • [ ] Change must be properly reviewed (2 reviews required, with at least 1 Reviewer, 1 Author)

Issue

  • JDK-8314215: Trailing Spaces before Line Breaks Affect the Center Alignment of Text (Bug - P4)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jfx.git pull/1236/head:pull/1236
$ git checkout pull/1236

Update a local copy of the PR:
$ git checkout pull/1236
$ git pull https://git.openjdk.org/jfx.git pull/1236/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 1236

View PR using the GUI difftool:
$ git pr show -t 1236

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jfx/pull/1236.diff

Webrev

Link to Webrev Comment

hjohn avatar Sep 10 '23 20:09 hjohn

:wave: Welcome back jhendrikx! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

bridgekeeper[bot] avatar Sep 10 '23 20:09 bridgekeeper[bot]

@prrace I've done an attempt to make a fix for this behavior, if you could take a look and let me know what you think, that'd be great

hjohn avatar Sep 10 '23 21:09 hjohn

@kevinrushforth I think this PR may be worth consideration, although I am not 100% sure that I haven't missed anything, the examples look a lot better with this fix.

hjohn avatar Sep 23 '23 14:09 hjohn

Here's the program I used to make the screenshots:


import javafx.application.Application;
import javafx.scene.Scene;
import javafx.scene.control.Label;
import javafx.scene.layout.GridPane;
import javafx.scene.text.Font;
import javafx.scene.text.TextAlignment;
import javafx.stage.Stage;

public class TextFlowProblem2 {
  public static void main(String[] args) {
    Application.launch(App.class);
  }

  public static class App extends Application {

     /**
      * @param args the command line arguments
      */
     public static void start(String[] args) {
       Application.launch(args);
     }

    @Override
    public void start(Stage stage) {
      Label label1 = new Label("AAA  BBB  CCC");

      label1.setFont(Font.font("courier new", 60));
      label1.setWrapText(true);
      label1.setTextAlignment(TextAlignment.LEFT);
      label1.setStyle("-fx-border-color: red;");

      Label label2 = new Label("AAA  BBB  CCC");

      label2.setFont(Font.font("courier new", 60));
      label2.setWrapText(true);
      label2.setTextAlignment(TextAlignment.CENTER);
      label2.setStyle("-fx-border-color: red;");

      Label label3 = new Label("AAA  BBB  CCC");

      label3.setFont(Font.font("courier new", 60));
      label3.setWrapText(true);
      label3.setTextAlignment(TextAlignment.RIGHT);
      label3.setStyle("-fx-border-color: red;");

      GridPane gridPane = new GridPane();

      Label alignmentLabel1 = new Label("Left Aligned");
      Label alignmentLabel2 = new Label("Center Aligned");
      Label alignmentLabel3 = new Label("Right Aligned");

      alignmentLabel1.setMinWidth(100);
      alignmentLabel2.setMinWidth(100);
      alignmentLabel3.setMinWidth(100);

      gridPane.addRow(0, alignmentLabel1, label1);
      gridPane.addRow(1, alignmentLabel2, label2);
      gridPane.addRow(2, alignmentLabel3, label3);

      stage.setScene(new Scene(gridPane));

      stage.setWidth(1024);
      stage.setHeight(1024);
      stage.show();
    }
  }
}

hjohn avatar Sep 23 '23 14:09 hjohn

@prrace Can you take a look?

kevinrushforth avatar Sep 30 '23 17:09 kevinrushforth

/reviewers 2

kevinrushforth avatar Oct 27 '23 16:10 kevinrushforth

@kevinrushforth The total number of required reviews for this PR (including the jcheck configuration and the last /reviewers command) is now set to 2 (with at least 1 Reviewer, 1 Author).

openjdk[bot] avatar Oct 27 '23 16:10 openjdk[bot]

I've tested AAA BBB CCC in Chrome, and the text is rendered pretty much exactly as your "before" version:

Center: center

Right: right

Here's the HTML:

<p style="white-space:pre-wrap; font:90pt courier new; text-align:right">AAA  BBB  CCC</p>

Edit: I just discovered that there are more white-space settings than I thought. Using white-space:pre-line the rendered text looks pretty much exactly like your "after" version. Maybe we also need this toggle in JavaFX?

mstr2 avatar Oct 30 '23 05:10 mstr2

I've tested AAA BBB CCC in Chrome, and the text is rendered pretty much exactly as your "before" version:

For the current rendering that JavaFX does, there is no browser equivalent, although pre-wrap comes close, JavaFX will take the spaces into account when adding the ellipsis (in HTML can test with overflow: hidden; text-overflow: ellipsis;) and displays odd empty lines in between the cut-off lines (see the 2nd before picture). I suppose this could also be considered a separate bug.

For browsers, pre-line is the normal equivalent that preserves spaces; this probably should be the default setting for FX as spaces are considered significant in FX controls. This what this PR provides.

Although I definitely think the pre-line (or normal but preserving white space) rendering should have been standard from the beginning, this could be provided with a CSS style (but that will need a default as well, and the current one isn't ideal as default IMHO). I'm however not sure if that's really worth it as JavaFX labels and text fields are quite different from browser elements. I'm unaware of other UI toolkits offering such settings.

hjohn avatar Oct 30 '23 08:10 hjohn

I tested the behavior of MS Word. The text is AAA BBB CCC.

Left aligned: image

Center aligned: image

Right aligned: image

nlisker avatar Oct 30 '23 12:10 nlisker

From https://www.unicode.org/reports/tr14-4/

5.6 Break opportunity after characters (A)
Breaking Spaces
SPACE (SP) � U+0020

The space characters are explicit break opportunities, but spaces at the end of a line are not measured for fit. If there is a sequence of space characters, and breaking after any of the space characters would result in the same visible line, the line breaking position after the last space character in the sequence is the locally most optimal one. In other words, since the last character measured for fit is BEFORE the space character, any number of space characters are kept together invisibly on the previous line and the first non-space character starts the next line.

It is sometimes convenient to use SP, but not the other breaking spaces to override context based behavior of other characters under the "anywhere, except where prohibited" style of line breaking (context analysis style 2).

EN QUAD � U+2000
EM QUAD � U+2001
EN SPACE � U+2002
EM SPACE � U+2003
THREE-PER-EM SPACE � U+2004
FOUR-PER-EM SPACE � U+2005
SIX-PER-EM SPACE � U+2006
PUNCTUATION SPACE � U+2008
THIN SPACE � U+2009
HAIR SPACE � U+200A

The preceding list of characters all have a specific width, but behave otherwise as breaking spaces .

ZERO WIDTH SPACE (ZWSP) � U+200B

This character does not have width. It is used in a style 2 context analysis to provide additional (invisible) break opportunities.

IDEOGRAPHIC SPACE � U+3000

This character has the width of an ideograph but like ZWSP is fully subject to the style 2 context analysis.

A quick check with (the latest?) MS Word 2208 on Windows shows that, at least with EN QUAD U+2000 it is treated as a regular character (i.e. it is always "displayed" even if right aligned).

andy-goryachev-oracle avatar Oct 30 '23 19:10 andy-goryachev-oracle

I've tested AAA BBB CCC in Chrome, and the text is rendered pretty much exactly as your "before" version:

For the current rendering that JavaFX does, there is no browser equivalent, although pre-wrap comes close, JavaFX will take the spaces into account when adding the ellipsis (in HTML can test with overflow: hidden; text-overflow: ellipsis;) and displays odd empty lines in between the cut-off lines (see the 2nd before picture). I suppose this could also be considered a separate bug.

For browsers, pre-line is the normal equivalent that preserves spaces; this probably should be the default setting for FX as spaces are considered significant in FX controls. This what this PR provides.

Although I definitely think the pre-line (or normal but preserving white space) rendering should have been standard from the beginning, this could be provided with a CSS style (but that will need a default as well, and the current one isn't ideal as default IMHO). I'm however not sure if that's really worth it as JavaFX labels and text fields are quite different from browser elements. I'm unaware of other UI toolkits offering such settings.

It seems there's no browser equivalent to your proposed behavior, either. pre-line does not preserve spaces, but your proposal does; it merely doesn't preserve end-of-line spaces.

While that might be a sensible default behavior, I'm not sure about the "invisible input" behavior that you'd get if you entered spaces at the end of a line.

mstr2 avatar Oct 30 '23 19:10 mstr2

It seems there's no browser equivalent to your proposed behavior, either. pre-line does not preserve spaces, but your proposal does; it merely doesn't preserve end-of-line spaces.

Yeah, I missed that somehow. pre-line looks close but collapsed inter word spaces; pre-wrap doesn't, but its wrapping behavior is different. I get the impression that pre-wrap is the one you want. It's described as "Whitespace is preserved by the browser. Text will wrap when necessary, and on line breaks", which sounds correct, apart from it also preserving some space when reflowing.

Perhaps Chrome is not doing it quite right? I tested Firefox (also Opera and some others but most of those are Chromium based), and it looks like browsers are disagreeing; Firefox is doing it differently (with pre-wrap):

image

Edit: Looks like Firefox will try to wrap too early though (it adds ellipsis too soon) so browsers don't seem to be a really authoritive source for this behavior.

While that might be a sensible default behavior, I'm not sure about the "invisible input" behavior that you'd get if you entered spaces at the end of a line.

That's done almost everywhere, just try it in the comment box here on Github for example. Type sufficient text to reach the edge of the box, then type a lot of spaces. They seem to "disappear", and they don't appear when the text reflows. If you then cursor back, the cursor won't appear until you passed all the spaces you typed; or if you break the line in some other place, all those spaces are still there. It feels odd, but I think it is not a bad way to handle this kind of thing.

The example of MS Word that Nir showed also shows that the extra spaces just go past the margin. Photoshop does the same (it's in one the tickets).

hjohn avatar Oct 30 '23 19:10 hjohn

@andy-goryachev-oracle

The space characters are explicit break opportunities, but spaces at the end of a line are not measured for fit. If there is a sequence of space characters, and breaking after any of the space characters would result in the same visible line, the line breaking position after the last space character in the sequence is the locally most optimal one. In other words, since the last character measured for fit is BEFORE the space character, any number of space characters are kept together invisibly on the previous line and the first non-space character starts the next line.

That's certainly a good description of how the breaking should be handled.

hjohn avatar Oct 30 '23 19:10 hjohn

@hjohn This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

bridgekeeper[bot] avatar Nov 28 '23 03:11 bridgekeeper[bot]

This really deserves to be fixed. Any progress on getting this reviewed?

hjohn avatar Nov 28 '23 09:11 hjohn

Do we want to limit this feature to ASCII spaces explicitly?

andy-goryachev-oracle avatar Nov 28 '23 16:11 andy-goryachev-oracle

@hjohn This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

bridgekeeper[bot] avatar Dec 26 '23 17:12 bridgekeeper[bot]

keep open

hjohn avatar Dec 26 '23 18:12 hjohn

@hjohn This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

bridgekeeper[bot] avatar Jan 24 '24 01:01 bridgekeeper[bot]

Regarding other whitespace characters. Using \u2003 em space, pasting the following text

AAA    BBB    CCC    

the results as seen with MS Word 16.78 (23100802) on macOS 14.2.1 indicate that it treats em space as whitespace:

Screenshot 2024-01-25 at 08 14 13 Screenshot 2024-01-25 at 08 14 22

andy-goryachev-oracle avatar Jan 25 '24 16:01 andy-goryachev-oracle

Even if we forget other whitespace characters and use 0x20 spaces, there is a weird problem where "CCC" jumps when changing the width of the window:

Screenshot 2024-01-25 at 08 29 32 Screenshot 2024-01-25 at 08 29 38

I am testing this PR branch merged with the latest master and a slightly different test program:

    @Override
    public void start(Stage stage) {
        String text =
            //"AAA\u2003\u2003\u2003\u2003BBB\u2003\u2003\u2003\u2003CCC\u2003\u2003\u2003\u2003";
            "AAA     BBB      CCC      ";

        TextFlow t = new TextFlow(new Text(text));
        t.setStyle("-fx-font-size:200%;");
        t.setTextAlignment(TextAlignment.CENTER);
        
        BorderPane root = new BorderPane(t);
        Scene scene = new Scene(root, 595, 150, Color.BEIGE);
        
        stage.setTitle("Text Alignment");
        stage.setScene(scene);
        stage.show();
    }

andy-goryachev-oracle avatar Jan 25 '24 16:01 andy-goryachev-oracle

Even if we forget other whitespace characters and use 0x20 spaces, there is a weird problem where "CCC" jumps when changing the width of the window:

I am testing this PR branch merged with the latest master and a slightly different test program:

This is because you don't have the same number of spaces between the words (first has 5 spaces, second has 6 spaces). I don't think it should do that, so I'll look into how this is happening.

hjohn avatar Jan 25 '24 23:01 hjohn

Regarding other whitespace characters. Using \u2003 em space, pasting the following text

I think we can use Character.isWhitespace. For the most part it aligns with "is this a breaking space". It's not perfect, but we can do more when there are complaints. Here's my test:

char     isWhiteSpace (breaking space)?
\u0009   true
\u0020   true
\u00a0   false
\u1680   true
\u180e   false <- unexpected
\u2001   true
\u2002   true
\u2003   true
\u2004   true
\u2005   true
\u2006   true
\u2007   false
\u2008   true
\u2009   true
\u200a   true
\u200b   false <- unexpected
\u200c   false <- unexpected
\u200d   false <- unexpected
\u202f   false
\u205f   true
\u3000   true

hjohn avatar Jan 25 '24 23:01 hjohn

thank you @hjohn ! this PR has been out for a while, could you sync it up with the master please?

andy-goryachev-oracle avatar Jan 25 '24 23:01 andy-goryachev-oracle

@andy-goryachev-oracle thanks for testing this, what you discovered was a bug in the computeTrailingWhiteSpace code that went undetected because of the simplicity of the strings I tested with. The positions array containing all the character widths is shared amongst all TextRuns (it does mention that) but I accessed it as if it was 0 based per TextRun. This means it sometimes included the width of characters that weren't spaces at all...

hjohn avatar Jan 26 '24 01:01 hjohn

char     isWhiteSpace (breaking space)?
\u180e   false <- unexpected
\u200b   false <- unexpected
\u200c   false <- unexpected
\u200d   false <- unexpected

I suspect this result is correct: \u180e mongolian vowel separator \u200b false <- zero-width space \u200c false <- zero-width non-joiner (ZWNJ) \u200d false <- The right-to-left mark (RLM)

For instance, https://unicode-explorer.com/c/180E It is no longer classified as space character (i.e. in Zs category) in Unicode 6.3.0, even though it was in previous versions of the standard.

Here is the (non-authoritative) list: https://www.fileformat.info/info/unicode/category/Zs/list.htm

Anyway, we probably can rely on Character.isWhitespace() being correct.

andy-goryachev-oracle avatar Jan 26 '24 15:01 andy-goryachev-oracle

@hjohn I see some weird behavior with the text string copied from https://github.com/openjdk/jfx/pull/1236#issuecomment-1910538102

See here:

Screenshot 2024-01-26 at 10 39 20

I've updated the MonkeyTester https://github.com/andy-goryachev-oracle/MonkeyTest

to have the text alignment and text bounds type properties - do you think you could try using it to test?

For example,

  • should the underline/strike through properties affect the handling of whitespace?
  • does the updated code handle TextBoundsType.VISUAL and .LOGICAL_VERTICAL_CENTER correctly?

andy-goryachev-oracle avatar Jan 26 '24 18:01 andy-goryachev-oracle

Also, the description of this PR mentions two other tickets

https://bugs.openjdk.org/browse/JDK-8145496 https://bugs.openjdk.org/browse/JDK-8129014

are these duplicates or do they describe similar, but legally different scenarios? if these are duplicates, they should probably be added to this PR.

andy-goryachev-oracle avatar Jan 26 '24 18:01 andy-goryachev-oracle