Add "dir=auto" to fields with potentially bidirectional text

Open tyrasd opened this issue 7 months ago • 6 comments

Addresses #11079 by setting dir=auto to all text input fields that potentially contain bi-directional text.

Unfortunately, this comes with a significant drawback: as the text direction of an input field also affects the direction of the input field's placeholder text, dir=auto means that placeholders will always be rendered left to right. This is because the "value" of the input field itself is empty in the situation where the placeholder is to be shown, and an empty string apparently falls back to left to right. :shrug:

Apparently, this is the intended behavior of the placeholders of input fields, and it is not possible to set the placeholder's text direction independently from the input field itself, but there's a documented workaround:

> In situations where the control's content has one directionality but the placeholder needs to have a different directionality, Unicode's bidirectional-algorithm formatting characters can be used in the attribute value.

todo:

  • [ ] fix placeholders' orientation
  • [ ] also include address field input elements

tyrasd, Jun 03 '25 14:06

Interesting trouble with the placeholder text, I didn't think about this. I don't like the workaround suggested in that link, it wouldn't fix text alignment. I think dynamically setting the dir attribute to ltr/rtl while the input is empty is the better option. My JS is rusty but I'll play around with it in jsfiddle and post here when I have something.

NeatNit, Jun 03 '25 15:06

https://jsfiddle.net/nz0cgush/

This gives ideal behavior, in my opinion.

<input type="text" id="thing" placeholder="שלום!" dir="auto" oninput="recalc_dir(this)">
// I hereby place this script/demo/fiddle in the public domain
// so don't worry about any copyright nonsense

function recalc_dir(o) {
  if (o.value !== "") {
    // there is a value: let the browser pick the direction from the typed text
    o.dir = "auto";
  } else {
    // empty value: the placeholder is showing, so use its direction instead
    o.dir = resolve_text_direction(o.placeholder);
  }
}

function resolve_text_direction(text) {
  // run relevant part of UBA to determine the "paragraph direction" of the given text
  // https://www.unicode.org/reports/tr9/#The_Paragraph_Level
  // Look for existing implementations - might be part of bidi-js https://www.npmjs.com/package/bidi-js
  // annoying that JS/browsers don't expose this as a standard library function, as they need to implement it anyway for dir="auto".
  
  // hard-coded for demo:
  return "rtl";
}

// text inputs don't fire a "load" event, so set the initial direction once
// (this has to run after the element above exists)
recalc_dir(document.getElementById("thing"));

Essentially: when there's any value in the input, it sets dir="auto" and lets the browser do its thing. When the value is empty, it sets dir="rtl" or dir="ltr" based on the placeholder text. A full implementation for determining the text direction of the placeholder text is needed. I'm not sure yet if the npm package I linked to actually has the necessary function, but even if not it'd be fairly easy to extract that information from the functions it does expose.
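
For illustration, here is a minimal sketch of what a non-hard-coded resolve_text_direction could look like without a library. It only approximates the first-strong-character rule from the UBA link above: it returns the direction of the first letter it finds, treats letters from a hand-picked, incomplete list of right-to-left scripts as strong RTL and every other letter as strong LTR, and ignores directional isolates. The name resolveTextDirection and the fallback parameter are made up for this example and are not something iD or bidi-js provides.

```javascript
// Letters in these scripts are treated as strong right-to-left.
// Assumption: this list is illustrative, not a complete set of RTL scripts.
const RTL_SCRIPT = /[\p{Script=Hebrew}\p{Script=Arabic}\p{Script=Syriac}\p{Script=Thaana}]/u;
const LETTER = /\p{L}/u;

// Returns "rtl" or "ltr" based on the first letter in the text,
// or `fallback` when the text contains no letters at all.
function resolveTextDirection(text, fallback = "ltr") {
  for (const ch of text) { // for...of iterates by code point
    if (!LETTER.test(ch)) continue; // skip digits, punctuation, spaces
    return RTL_SCRIPT.test(ch) ? "rtl" : "ltr";
  }
  return fallback;
}

console.log(resolveTextDirection("שלום!"));    // "rtl"
console.log(resolveTextDirection("English!")); // "ltr"
```

Digits and punctuation are skipped because they are not strong characters, so a placeholder like "123 שלום" would still resolve to "rtl" in this sketch.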

NeatNit, Jun 03 '25 16:06

> I think dynamically setting the dir attribute to ltr/rtl while the input is empty is the better option.

Yeah, something like that would also work. I'm slightly tending towards the recommended solution with the RLE characters, as that does not require adding additional event handlers, which can be tricky to debug. Also, for simplicity I would just fix the direction of the placeholder texts according to the direction of the locale's script (or ltr if we know that a placeholder comes from an untranslated source like taginfo). Yes, placeholders from missing translations might have quirky punctuation, but that's only an additional indication that someone should translate the respective string(s).
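
A rough sketch of that locale-based rule, to make the two cases concrete. The function name, the isTranslated flag and the short list of right-to-left language codes are all assumptions made for this example; iD's actual locale handling is not shown here.

```javascript
// Assumption: a hand-written, incomplete list of right-to-left language codes.
const RTL_LANGUAGES = new Set(["ar", "fa", "he", "ur"]);

// Direction to use for a placeholder, given the UI locale and whether the
// placeholder string was actually translated into that locale.
function placeholderDirection(localeCode, isTranslated) {
  // Untranslated strings (e.g. values coming from taginfo) are treated as ltr.
  if (!isTranslated) return "ltr";
  // Reduce "he-IL" or "he_IL" to the bare language code "he".
  const language = localeCode.toLowerCase().split(/[-_]/)[0];
  return RTL_LANGUAGES.has(language) ? "rtl" : "ltr";
}

console.log(placeholderDirection("he", true));  // "rtl"
console.log(placeholderDirection("he", false)); // "ltr"
```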

> It wouldn't fix text alignment

The alignment can be fixed independently from the text "rtl/ltr" problem with css. In fact, I'm also tending towards fixing all "tabular" text to a single alignment: When the UI is in a rtl locale (e.g. Hebrew), all text is right-aligned, including raw OSM tags or names in English (and still using the correct respective text direction, of course). And vice versa for ltr locales (e.g. English). This helps with readability by keeping consecutive lines aligned horizontally and generally reduces visual noise. I think that makes the most sense from a typographic point of view.

I tried to find examples of how similar situations are handled in reality with mixed directional text (e.g. on road signs, advertisements, maps, etc.), but it seems like the problem is in practice almost always worked around by either justifying or centering all text, by adjusting font sizes to get an almost justified result, or by using completely independent columns for the different scripts. None of those methods work for dynamic and "tabular" data like in iD. Do you maybe have additional examples of how this is approached (well) in practice @NeatNit ?

tyrasd, Jun 03 '25 18:06

> I'm slightly tending towards the recommended solution with the RLE characters, as that does not require adding additional event handlers, which can be tricky to debug.

As you are far more experienced than me in general web development and in iD specifically, I'll just accept this as fact, but I stand by my earlier opinion. That said, alignment should be the only visual difference between the two solutions.

> Also, for simplicity I would just fix the direction of the placeholder texts according to the direction of the locale's script (or ltr if we know that a placeholder comes from an untranslated source like taginfo).

That should work.

> Yes, placeholders from missing translations might have quirky punctuation, but that's only an additional indication that someone should translate the respective string(s).

With the embedding solution (the Unicode formatting characters), if you use LTR embedding for untranslated strings and RTL embedding for translated strings, I think punctuation would be rendered correctly in all cases.
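
As a sketch of what the embedding approach amounts to: the placeholder string gets wrapped in an opening embedding character and a closing POP DIRECTIONAL FORMATTING character. The helper name embedPlaceholder is made up for this example.

```javascript
const LRE = "\u202A"; // LEFT-TO-RIGHT EMBEDDING
const RLE = "\u202B"; // RIGHT-TO-LEFT EMBEDDING
const PDF = "\u202C"; // POP DIRECTIONAL FORMATTING (closes the embedding)

// Wraps the text so that it is laid out in the given direction,
// independent of the direction of the surrounding input field.
function embedPlaceholder(text, dir) {
  const opener = dir === "rtl" ? RLE : LRE;
  return opener + text + PDF;
}

// Translated string in an RTL locale vs. an untranslated (English) string:
const translated = embedPlaceholder("עברית!", "rtl");
const untranslated = embedPlaceholder("English!", "ltr");
console.log(translated.length, untranslated.length);
```

The wrapped string can then be used as the placeholder attribute value; the two extra characters are invisible but do count towards the string length.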

> The alignment can be fixed independently from the text "rtl/ltr" problem with css.

You're giving me some silly ideas. I might share them soon. ;)

> In fact, I'm also tending towards fixing all "tabular" text to a single alignment: When the UI is in a rtl locale (e.g. Hebrew), all text is right-aligned, including raw OSM tags or names in English (and still using the correct respective text direction, of course). And vice versa for ltr locales (e.g. English). This helps with readability by keeping consecutive lines aligned horizontally and generally reduces visual noise. I think that makes the most sense from a typographic point of view.

> I tried to find examples of how similar situations are handled in reality with mixed directional text (e.g. on road signs, advertisements, maps, etc.), but it seems like the problem is in practice almost always worked around by either justifying or centering all text, by adjusting font sizes to get an almost justified result, or by using completely independent columns for the different scripts. None of those methods work for dynamic and "tabular" data like in iD. Do you maybe have additional examples of how this is approached (well) in practice @NeatNit ?

This is a tough question with no right answer, but when the column width is narrow enough, I personally prefer each cell to be aligned independently. For technical people like me it's also a good way to tell whether the string is resolved as LTR or RTL, which isn't always obvious from the rendered text alone. For example:

ABC אבג

אבג ABC

Without alignment, you wouldn't know there's a difference between these two by just looking at them.
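
One way to see the difference is to look at the two strings in logical (storage) order instead of as rendered. A small sketch, with the strings written as escapes so the order is unambiguous; the helper name codePoints is made up for this example.

```javascript
// Lists the code points of a string in logical (storage) order.
function codePoints(text) {
  return [...text].map(
    (ch) => "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
  );
}

const latinFirst = "ABC \u05D0\u05D1\u05D2";  // A, B, C, space, alef, bet, gimel
const hebrewFirst = "\u05D0\u05D1\u05D2 ABC"; // alef, bet, gimel, space, A, B, C

console.log(codePoints(latinFirst)[0]);  // "U+0041"
console.log(codePoints(hebrewFirst)[0]); // "U+05D0"
```

The first strong character differs (a Latin letter in one, a Hebrew letter in the other), and that is what decides whether the whole string resolves as LTR or RTL.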

I will look for examples, but I don't have any on hand. I'd guess in 90% of cases people don't make a conscious decision and just use whatever default behavior the software they're using gave them. I have definitely seen both options in the wild - that is, where an entire column is either left- or right-aligned, or where each cell is aligned independently based on its content.

Worth noting: Excel and LibreOffice Calc by default align each cell independently. I also feel like this is the more commonly used option elsewhere, but atm I don't have the data or examples to back it up.

NeatNit, Jun 03 '25 19:06

> > The alignment can be fixed independently from the text "rtl/ltr" problem with css.
>
> You're giving me some silly ideas. I might share them soon. ;)

https://jsfiddle.net/bn7jfxth/

<input type="text" placeholder="&#x202B;עברית!&#x202C;" dir="auto">
<br>
<input type="text" placeholder="English!" dir="auto">
input:placeholder-shown[dir="auto"][placeholder^="\00202B"] {
  direction: rtl;
}

How's that for a compromise?

Edit: come to think of it, the embedding characters aren't needed. Whichever code would have added the embedding characters should instead add the class rtl-placeholder:

<input type="text" placeholder="עברית!" dir="auto" class="rtl-placeholder">
<br>
<input type="text" placeholder="English!" dir="auto">
input:placeholder-shown.rtl-placeholder[dir="auto"] {
  direction: rtl;
}

The code could use whatever heuristic to determine this.
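
A minimal sketch of the script side of this, assuming the heuristic is passed in as a function that returns "rtl" or "ltr" for a placeholder string. The function name markPlaceholderDirection is made up for this example; only the class name rtl-placeholder comes from the CSS above.

```javascript
// Adds or removes the rtl-placeholder class on an input element,
// depending on what the caller-supplied heuristic says about its placeholder.
function markPlaceholderDirection(input, heuristic) {
  const isRtl = heuristic(input.placeholder) === "rtl";
  // classList.toggle with a second argument forces the class on or off.
  input.classList.toggle("rtl-placeholder", isRtl);
  return input;
}

// Usage in a page (not run here, since it needs a DOM):
// document.querySelectorAll('input[dir="auto"][placeholder]')
//   .forEach((input) => markPlaceholderDirection(input, myHeuristic));
```

Since the class only matters while :placeholder-shown matches, this has to run once per input (and again if the placeholder text changes), not on every keystroke.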

NeatNit, Jun 03 '25 19:06

> input:placeholder-shown.rtl-placeholder[dir="auto"] { direction: rtl; }

Oh, for some reason, I thought it was impossible to override the DOM attribute dir=auto with a specific direction in CSS. But it seems to work fine. So yeah, that does seem to be the most elegant solution. :+1: Thanks! :bow:

tyrasd, Jun 03 '25 19:06