Proposal for Element.currentLang and Element.currentDir

I'd like to propose the introduction of two read-only properties on the Element object:

Element.currentLang
Element.currentDir

Element.currentLang and Element.currentDir are read-only properties that reflect the element's current language and direction, as determined by the element's own lang and dir attributes or, if those are absent, those of its closest ancestor.

The primary use case is to improve i18n in custom elements, but the benefit will also be seen by frameworks that currently use a separate, non-standard context to determine these values. Exposing the current inherited language and direction will provide better localization capabilities by removing performance hurdles and eliminating the need for additional logic and special contexts.

This information isn't currently available without expensive DOM traversal. Furthermore, lookups such as Element.closest('[lang]') stop when they reach a shadow root, so recursive logic is needed to break out of it:

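// Recursive version of Element.closest() that breaks through shadow roots
function closest(selector, root = this) {
  function getNext(el) {
    // Stop at window, document, or when there is no shadow host left to climb to
    if (!el || el === window || el === document) {
      return null;
    }

    // Look within the current tree first, then continue from the shadow host
    const next = el.closest(selector);
    return next ? next : getNext(el.getRootNode().host);
  }

  return getNext(root);
}

// closest() returns null when no ancestor has a lang attribute
const lang = closest('[lang]', myEl)?.lang ?? '';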

As a custom element author, I find it's not uncommon for users to have dozens of components on a page. It's also not impossible for a page to have multiple languages and directions. For components that require localization, the only way for them to inherit lang and dir is via DOM traversal or other non-standard logic. This, of course, isn't very efficient.

Being able to reference Element.currentLang and Element.currentDir will solve this in an elegant way using data the browser is likely already aware of.

Additional thoughts:

It seems pragmatic to expect lang and dir to pass through shadow roots. If desired, the custom element author can override it by applying lang or dir to the host element or to any element within the shadow root.
This proposal doesn't address a way to listen for language/direction changes. This would be incredibly useful, but probably out of scope for discussion within this group.
Interestingly, this is something that we can do with CSS via :lang and :dir (limited support). Unfortunately, there's no clean way to discover this value with JavaScript.
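
To make the first point above concrete, here is a minimal sketch of how a lang lookup that passes through shadow roots could behave. The function name resolveLang is hypothetical and not part of the proposal; the sketch only assumes nodes that expose parentNode, host, and getAttribute, which is the shape real DOM nodes have.

```javascript
// Sketch: resolve the language for a node by walking up through parents and
// shadow hosts. `resolveLang` is a hypothetical helper, not a proposed API.
function resolveLang(node) {
  for (let current = node; current; current = current.parentNode ?? current.host) {
    // Documents and shadow roots have no getAttribute; skip over them
    if (typeof current.getAttribute !== 'function') continue;

    const lang = current.getAttribute('lang');
    // The first lang attribute found wins, even an empty one (language unknown)
    if (lang !== null) return lang;
  }
  // No lang attribute anywhere up the chain
  return '';
}
```

With this behavior, a lang on the host element overrides whatever the page sets, which matches the override described above.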

Sep 08 '21 12:09 claviska

Moving this to HTML as DOM doesn't define language or direction. We've had discussions about this kind of feature in the past. If someone could dig those up that would be helpful.

Sep 08 '21 12:09 annevk

element.matches(':lang(en)') works pretty well though, but I wonder if having those passing through SD would break expectations, especially in case people didn't test different dir of their SD elements.

On the other hand, this problem doesn't exist without SD or built-in extends, but enabling something to go through in SD might be the beginning of tons of other requests.

Sep 08 '21 17:09 WebReflection

element.matches(':lang(en)') works pretty well though

This works if you know the language(s) being used. If the language is arbitrary, there's really no mechanism to determine it without brute forcing it (silly) or DOM traversal (expensive).

On the other hand, this problem doesn't exist without SD or built-in extends

This problem is not exclusive to custom elements with shadow roots. You'd still need DOM traversal to reliably use the current language or direction of any element. A non-custom element use case may be a library or framework that handles localization and would prefer to use the platform-provided lang instead of a specialized context.

enabling something to go through in SD might be the beginning of tons of other requests.

I'd argue that localization is fairly unique and shouldn't be reset. At least, I can't think of a single use case where localization shouldn't persist until explicitly changed in the DOM tree.

Sep 08 '21 21:09 claviska

lang and dir are not coupled though, but I agree indeed there's no way to know these directives without some JS seppuku.

however, dir is usually language dependent, and if document.documentElement.lang is an empty string, we are in trouble, but otherwise it's relatively trivial to know if a well presented HTML page has a language preference.

mapping lang to a dir is not a trivial task, and if the browser knows how it should behave according to either lang or user settings, it could be awesome to understand that, yet I believe any hook on the document would do, as anything else would likely violate the user lang/dir preferences (just trying to keep this proposal simple enough, and yes, it's useful for my daily use-cases too).

Sep 08 '21 21:09 WebReflection

It's worth noting that there can be multiple languages in an HTML document, so referencing document.documentElement.lang isn't a reliable solution. Some examples:

Displaying excerpts in other languages
Displaying quotations in other languages
Things like: <p>The word for "hello" in Spanish is <span lang="es">Hola</span></p>
<time datetime="2021-09-06 18:20" lang="es">[time formatted in es locale]</time>

It would be useful for libraries, utilities, and child components to be self-aware of the intended language and direction so they can render with the correct locales.
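
As a rough illustration of the <time> case above, a component that knows its resolved language could pass it straight to Intl. This is only a sketch: formatForLang is a hypothetical helper, the lang argument stands in for whatever a property like currentLang would return, and the UTC time zone is an assumption made so the output doesn't depend on the machine.

```javascript
// Sketch: format a date for whatever language an element resolves to.
// `formatForLang` is hypothetical; `lang` stands in for a resolved language.
function formatForLang(date, lang) {
  // An empty lang means "unknown"; undefined lets Intl use its default locale
  const locale = lang || undefined;

  return new Intl.DateTimeFormat(locale, {
    dateStyle: 'long',
    timeStyle: 'short',
    timeZone: 'UTC', // assumption: fixed so output is machine-independent
  }).format(date);
}

// The datetime from the <time> example, assumed here to be UTC
const when = new Date('2021-09-06T18:20:00Z');
console.log(formatForLang(when, 'es'));
console.log(formatForLang(when, 'en'));
```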

Sep 08 '21 22:09 claviska

fair enough, then I guess the currentX proposal is desirable (got confused with the i18n language related API, your use cases are indeed relatively common).

Sep 09 '21 05:09 WebReflection

for direction, it seems you can at least do this:

getComputedStyle(element).direction

seems to work even in scenarios where elements are nested with different values, including being implicitly inherited from some ancestor. see here https://codepen.io/WickyNilliams/pen/QWqgXOQ

though I still think a dedicated property is useful.

I would say there should even be some way to observe changes to these values. If an element gets re-parented, or some (unknown) ancestor has its lang/dir values change, aside from polling (ugh) there would be no way to know and react to such a change

Dec 20 '21 15:12 WickyNilliams

getComputedStyle(element).direction

Good tip. I believe this will trigger a reflow, though, so a cached property would be preferred.

I would say there should even be some way to observe changes to these values.

I agree. Perhaps an event similar to languagechange would be helpful, but I don't want to bloat the initial proposal. It's also worth noting that lang and dir both reflect, so a mutation observer could be used to detect such changes in the interim.
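
For reference, the interim mutation observer approach might look something like the sketch below. observeLangDir and its ObserverCtor parameter are hypothetical names; the parameter exists only so the function can be exercised outside a browser, and in a page it defaults to the real MutationObserver. Note that this only sees attribute changes within the tree being observed, so it says nothing about re-parenting or about changes inside other shadow roots.

```javascript
// Sketch: watch lang/dir attribute changes on a root and everything below it.
// `ObserverCtor` is a hypothetical parameter; it defaults to MutationObserver.
function observeLangDir(root, onChange, ObserverCtor = globalThis.MutationObserver) {
  const observer = new ObserverCtor((records) => {
    for (const record of records) {
      // attributeName is "lang" or "dir" because of attributeFilter below
      onChange(record.target, record.attributeName);
    }
  });

  observer.observe(root, {
    attributes: true,
    attributeFilter: ['lang', 'dir'],
    subtree: true,
  });

  // Return a cleanup function that stops observing
  return () => observer.disconnect();
}
```

In a page this would be called as observeLangDir(document.documentElement, callback).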

Dec 20 '21 16:12 claviska

agreed, it is far from ideal, but a decent workaround for now.

can an MO cover all cases? what's the perf impact of observing the entire subtree from the document root? what happens if there are nested, intermediary shadow roots and the dir/lang is subject to change inside any of them? you'd have to climb up the tree and attach an observer at every root, as well as document? feels like there might be a ton of edge cases!

Dec 20 '21 16:12 WickyNilliams

can an MO cover all cases?

It won't pick up attribute changes in shadow roots, so no. Each component that's interested would need to attach a separate observer to its respective shadow root, which isn't ideal.

A composed event that bubbles up would be optimal, but that should probably be a separate proposal. But since we're here, perhaps dirchange and langchange would be reasonable candidates for event names.

Dec 20 '21 17:12 claviska

I think observability is a must - that's the trickiest part of this, and it's something browsers already implement in order to support :lang() and :dir() selectors.

I presented a couple of options in https://github.com/whatwg/html/issues/9918:

Option 1

Extend MutationObserver (or create a new LanguageObserver) to allow for observing currentLang. It feels like this should be an observer rather than an event since it's so closely linked to DOM changes.

Option 2

Provide a way to observe changes in CSS selector matching.

const result = element.matchSelector(`:lang(${element.currentLang})`);

result.addEventListener('change', () => {
  const newLang = element.currentLang;
  // …
});

This is based on window.matchMedia, but matches a selector.

Nov 08 '23 09:11 jakearchibald

interesting ideas.

extending mutation observer seems off to me, since MO is concerned with sub-trees, whereas lang/dir are the opposite (comes from above). is there precedent for an observer which works that way?

matchSelector feels like a broadly useful API, even outside of this use case - curious if that harms or helps the chances of getting this through?

Nov 08 '23 09:11 WickyNilliams

We could also split the difference between LangObserver/matchSelector and look at something like SelectorObserver.

Nov 08 '23 09:11 keithamus

@WickyNilliams

extending mutation observer seems off to me, since MO is concerned with sub-trees, whereas lang/dir are the opposite (comes from above).

It's only concerned with subtrees if you opt into that, otherwise it's just concerned with the element being observed. But you're right that none of the values it observes are computed.

is there precedent for an observer which works that way?

Intersection and resize observers observe computed values that are impacted by things all over the tree.

matchSelector feels like a broadly useful API, even outside of this use case - curious if that harms or helps the chances of getting this through?

I agree it would be generally useful, however it might not be the best fit for this use-case. The example I gave only observes one change - you'd need to un-observe and observe the new value each time. Not too tricky though.

The potentially trickier issue is timing. Something like matchSelector wouldn't signal its changes until style calculation, which feels wrong for something like lang, which relates to content semantics rather than style. But if that's the timing browsers use for updating <input> etc, fine.

Nov 08 '23 10:11 jakearchibald

Hmm, browsers don't seem to respect the element's language when it comes to <input>.

Nov 08 '23 10:11 jakearchibald

It's only concerned with subtrees if you opt into that, otherwise it's just concerned with the element being observed. But you're right that none of the values it observes are computed.

sorry yes, i used an overloaded term with sub-trees. i meant whether entries are derivations of the element/its descendants vs an element/its ancestors. i guess if there is an accompanying computed property on the element, then conceptually such an observer doesn't differ. makes sense re: IO/RO.

you'd need to un-observe and observe the new value each time.

hmm yes, that would be quite an awkward API.

Something like matchSelector wouldn't signal its changes until style calculation, which feels wrong for something like lang, which relates to content semantics rather than style.

might the timing issues cause any temporary inconsistent states? e.g. i'm thinking of a case where i change to hebrew as a lang and rtl as dir - could you end up with hebrew shown in a LTR layout, or the previous content in an RTL layout? either visually, or from the perspective of running code. i'm not familiar enough with browser internals to understand the implications

Nov 08 '23 10:11 WickyNilliams

might the timing issues cause any temporary inconsistent states? e.g. i'm thinking of a case where i change to hebrew as a lang and rtl as dir - could you end up with hebrew shown in a LTR layout, or the previous content in an RTL layout? either visually, or from the perspective of running code.

The browser would have to calculate styles in order to render, and that would trigger the observer, so I don't think that's a problem.

If a tab is "not visible" (therefore not generating frames, therefore not calculating style), running code (eg setInterval) could observe that currentLang has changed but the content hasn't.

You'd still get a bit of that with an observer, since it'd be offset by a microtask, but tying it to rendering seems confusing.

Observers don't need to be tied to microtasks fwiw. Mutation observers are tied to microtasks, whereas resize/intersection observers are tied to rendering.

It feels like this should be timed similar to mutation observers, since it's DOM mutations that cause lang to change.

Nov 08 '23 10:11 jakearchibald

Hmm, browsers don't seem to respect the element's language when it comes to <input>.

Kinda related, something we went around in circles on was trying to understand the difference between:

lang attribute
navigator.language / navigator.languages
new Intl.NumberFormat().resolvedOptions().locale
Content-Language header of the document (not sure how it manifests in JS)

In the end what we want to know is when should we as authors use:

what the server says is the intended locale for the document (maybe not actually useful?)
what the user's browser settings say their preferred locale is (let them input in preferred locale maybe...)
what the current operating system locale is (maybe not useful to know directly / should rely on user browser setting)
what the current element context calculates the locale is, i.e. current value of lang cascade (seems the most useful)

Nov 08 '23 11:11 rajsite

As the OP, I want to point out that I think @jakearchibald's proposal for matchSelector() is superior to my initial proposal in that it solves both getting the current language/dir and observing changes.

Consider this my vote for that as an alternative to the aforementioned properties. Additionally, the use cases for el.matchSelector() extend beyond just custom element localization.

To recap, from Jake's post above:

const result = element.matchSelector(`:lang(${element.computedLang})`);

result.addEventListener('change', () => {
  const newLang = element.computedLang;
  // …
});

Nov 08 '23 16:11 claviska

A complete solution using a hypothetical matchSelector:

let result = element.matchSelector(`:lang(${element.currentLang})`);

function handleChange() {
  const newLang = element.currentLang;
  // The old result only tracks the previous language, so observe the new one
  result = element.matchSelector(`:lang(${newLang})`);
  result.addEventListener('change', handleChange, { once: true });
}

result.addEventListener('change', handleChange, { once: true });

It's quite awkward having to remember to clean up and attach a new listener on every change imo. Of course this could be tidied a little, but it's just to demo it's not as easy as the snippet above

Nov 08 '23 18:11 WickyNilliams

I brought up the idea of observing a selector in whatwg/dom recently (which i guess has less visibility than here): https://github.com/whatwg/dom/issues/1225

My use-case is different, but it's nice to see that the idea was thought up from a completely different angle. As others have said, this would be generally useful to have.

Nov 08 '23 18:11 matthewp

Maybe language / locale could be made a CSS property (if it makes sense for direction then why not?) and we can have ComputedStyleObserver handle those and more.

Nov 08 '23 18:11 rajsite

@rajsite Direction relates to layout and language does not. If you're interested in pushing for lang in CSS, file an issue with the CSSWG, but you'll need better reasoning than "why not" 😄

Nov 09 '23 09:11 jakearchibald

I don't think my "option 2" is viable due to style calculation timing. Language changes in response to tree/attribute changes, so observation should be immediate (like most change events) or off by a microtask (like mutation observers).

Also, option 2 has some unfortunate DX issues, because matchSelector will only tell you about a change to and from a particular language, when you actually want to hear about any change in language. You'll end up having to wrap it in something like:

function observeLanguage(
  element,
  { onChange, signal = new AbortController().signal },
) {
  if (signal.aborted) return;

  const result = element.matchSelector(`:lang(${element.currentLang})`);

  result.addEventListener(
    "change",
    () => {
      onChange();
      observeLanguage(element, { onChange, signal });
    },
    { once: true, signal },
  );
}

I think the better options are an event, or, if an event is bad for the same reason mutation events are bad, some kind of LanguageObserver.

Nov 09 '23 10:11 jakearchibald

agreed. there are compelling use cases for a matchSelector type API, but it feels like squeezing a square peg through a round hole to use it here. i'd rather something purpose built that takes the complexities of writing direction and language into consideration with regards to timing and language being a stream of values over time (rather than one-shot with matchSelector). A dedicated observer or event is fine by me

Nov 09 '23 10:11 WickyNilliams

Another difficult-to-track, ancestor-influenced state that would be useful to observe for changes is isContentEditable, which is based on the contenteditable configuration.

Recently ran into an issue where we would like to have that propagate into the shadow root of a custom element which is currently blocked from propagating on its own. So we are trying to investigate ways to observe the state and reflect it in the shadowroot manually.

Nov 29 '23 00:11 rajsite

@rajsite that feels different to this issue. Can you file a new issue for your request?

Nov 30 '23 09:11 jakearchibald

Now that the :dir pseudo-class has pretty decent support, it's at least easy to get the resolved dir of the current element via matches, which i imagine is cheaper than my previous approach of using getComputedStyle

const isLTR = someElement.matches(":dir(ltr)")

Still, being able to observe this would be nice.
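
For anyone wanting both approaches from this thread in one place, here is a rough sketch that tries :dir() first and falls back to computed style where the selector isn't supported. resolvedDir and its getStyle parameter are hypothetical names; getStyle defaults to getComputedStyle and is only a parameter so the function can be tried outside a browser. Keep in mind that the CSS direction property can be set independently of the dir attribute, so the fallback is an approximation.

```javascript
// Sketch: resolve an element's direction, preferring :dir() and falling back
// to computed style. `getStyle` is a hypothetical parameter for testability.
function resolvedDir(element, getStyle = globalThis.getComputedStyle) {
  try {
    if (element.matches(':dir(rtl)')) return 'rtl';
    if (element.matches(':dir(ltr)')) return 'ltr';
  } catch {
    // matches() throws a SyntaxError where :dir() is not supported
  }

  // Fallback: the CSS direction, which usually (not always) agrees with dir
  return getStyle(element).direction;
}
```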

Mar 26 '24 13:03 WickyNilliams

We discussed this today at TPAC in a breakout with @dbaron @jyasskin @fantasai (who also asked @annevk @r12a).

Originally, the sentiment for computedLang/computedDir was "seems reasonable, should be trivial to implement since the browser already tracks this, all that’s needed is patches for the spec and UAs".

However, after thinking about it some more, some folks were worried that once computedLang becomes a thing, authors are going to try and do naive language parsing like el.computedLang === "en" || el.computedLang.startsWith("en-"). Language parsing is full of complicated edge cases and authors should not be rolling their own, so people felt this could be a footgun.
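
To illustrate the footgun, here is a small sketch comparing the naive check with the existing Intl.Locale parser. The function names are made up for the example, and nothing here implies Intl.Locale is the parsing API the breakout had in mind; it just shows one edge case (language tags are case-insensitive) that string comparison gets wrong.

```javascript
// Naive check from the discussion: a case-sensitive string comparison
function isEnglishNaive(tag) {
  return tag === 'en' || tag.startsWith('en-');
}

// Sketch using the existing Intl.Locale parser instead of string manipulation
function isEnglish(tag) {
  try {
    // Intl.Locale canonicalizes the tag, so "EN-us" parses to language "en"
    return new Intl.Locale(tag).language === 'en';
  } catch {
    // Intl.Locale throws a RangeError for empty or malformed tags
    return false;
  }
}

console.log(isEnglishNaive('EN-US')); // false
console.log(isEnglish('EN-US')); // true
```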

Ideas discussed were:

1. Decoupling: el.computedLang returns a string, and can ship earlier, but there is a separate class or utility method whose constructor accepts a language string and parses it into its components (which are defined by Unicode).
   - Pro: el.computedLang can ship earlier and is not blocked on the more complicated language parsing API
   - Pro: For performance sensitive use cases, the object creation can be deferred until actually needed, rather than having every lookup result in object creation
   - Con: The footgun of having a string that authors use string manipulation on is still there
   - Pro: The new class can be used to parse languages more generally, without having to set them on a dummy element first
   - Con: Unclear how to monitor changes, we’d need another API for that
   - Pro: Consistent with element.lang / element.dir
2. el.computedLang returns an object with the language components parsed (as well as the whole string). Has a toString() method that returns the whole string, so that it can still be used as a string. Or it could have a matches() method to facilitate comparison.
   - Pro: No footgun, doing the right thing is as easy as doing the wrong thing
   - Con: The string version cannot ship independently and is blocked on the more complicated API (unless we just ship an object with a single property and add more properties later)
   - Con: More complicated to spec, will take longer
   - Pro: Extensibility. We can later add methods/properties that facilitate observing changes or any other utility. Or the object itself could even be an EventTarget that fires change events.
   - Con: Inconsistent with element.lang / element.dir

During the discussion I was of the opinion we should do 1, but now I’m leaning towards 2, potentially with a very bare-bones object at first. We could later make such objects constructible to address use cases not connected to the DOM (e.g. navigator.language).

Sep 26 '24 21:09 LeaVerou

URL-exposing APIs follow option 1

Sep 26 '24 21:09 jakearchibald
