Should vertical tab character be included in ASCII whitespace?
What is the issue with the Infra Standard?
This definition of ASCII whitespace does not include U+000B VT: https://infra.spec.whatwg.org/#ascii-whitespace
I looked on Wikipedia, which links to what looks like the Unicode Standard, which calls VT whitespace.
I noticed this while implementing https://github.com/whatwg/dom/pull/1079 because VT is also considered whitespace in an old helper method in Chromium.
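For reference, here is a minimal sketch of what the linked Infra definition amounts to in code. Infra lists five code points (U+0009 TAB, U+000A LF, U+000C FF, U+000D CR, U+0020 SPACE). The names `isAsciiWhitespace` and `stripAsciiWhitespace` are hypothetical, chosen for illustration, and are not taken from Chromium or any spec.

```typescript
// The five code points Infra lists as ASCII whitespace.
// U+000B VT is deliberately absent.
const ASCII_WHITESPACE: ReadonlySet<number> = new Set([
  0x09, // TAB
  0x0a, // LF
  0x0c, // FF
  0x0d, // CR
  0x20, // SPACE
]);

// Hypothetical helper: true only for the five code points above.
function isAsciiWhitespace(codePoint: number): boolean {
  return ASCII_WHITESPACE.has(codePoint);
}

// Hypothetical helper modeled on Infra's
// "strip leading and trailing ASCII whitespace".
function stripAsciiWhitespace(input: string): string {
  let start = 0;
  let end = input.length;
  while (start < end && isAsciiWhitespace(input.charCodeAt(start))) start++;
  while (end > start && isAsciiWhitespace(input.charCodeAt(end - 1))) end--;
  return input.slice(start, end);
}
```

With this sketch, `stripAsciiWhitespace(" \t\n\f\ra \r")` gives `"a"`, while a string that begins or ends with VT comes back unchanged.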
I think most parts of the web platform do not consider VT whitespace. I don't recall where this was discussed though; maybe @annevk does.
We should definitely document this either way, similar to how we document U+000C FF. (https://github.com/whatwg/infra/pull/649)
I'm curious which parts of Chromium consider VT whitespace and which parts do not. Do you know? Based on the code search, it looks like some parts of Chromium do consider it, including parts which per spec should not.
I wonder if we have web platform tests for this... eventually we should, for all the places "ASCII whitespace" is used on the platform. (I guess for FF as well.)
But, I also don't want to make anyone shave this yak in a way that blocks https://github.com/whatwg/dom/pull/1079 :)
> I think most parts of the web platform do not consider VT whitespace.
Unless for some reason we’re explicitly not considering JavaScript implementations, it seems worth noting here that the “White Space Code Points” definition in the ES spec includes U+000B VT.
> Based on the code search, it looks like some parts of Chromium do consider it, including parts which per spec should not.
I know that’s definitely the case for WebKit as well.
As far as the general pattern, I think that unless some feature has a spec which explicitly requires ASCII whitespace, the implementations use a looser definition of whitespace that includes VT.
But as far as the parts-which-per-spec-should-not cases, I’ve run across plenty of those in implementations. I think in some (or many) of those cases, it may be because there aren’t actually any WPT tests for checking it.
https://github.com/WebKit/WebKit/pull/24217 is one sorta related example. In that case, it’s new code. But the reason it hasn’t landed is because there are no existing WPT tests for it, and I never got around to making time to write the tests myself.
But I also vaguely recall that some (or many) parts of CSS code also use a broader definition of whitespace, rather than the CSS post-preprocessing whitespace definition (which is functionally equivalent to ASCII whitespace) that they should be using per spec. But I could be misremembering.
I fixed a bunch of this in WebKit some years ago and added better helpers. There are still some things to be fixed, but I'm not in favor of changing this definition at this point. If anything it's already too wide.
JavaScript is not a good place to borrow from as for some inexplicable reason they use Unicode's definition of White Space, which changes over time. That's absolutely not what we'd want.
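To make the difference concrete, here is a small sketch contrasting JavaScript's built-in whitespace handling with an ASCII-only trim. `String.prototype.trim` follows the ES whitespace definition, which includes U+000B VT and U+00A0 NBSP. The helper `trimAsciiOnly` and the sample string are hypothetical and only illustrate the point.

```typescript
// Matches runs of Infra's five ASCII whitespace code points at either edge.
// VT (U+000B) and NBSP (U+00A0) are intentionally not in the class.
const ASCII_WS_EDGES = /^[\t\n\f\r ]+|[\t\n\f\r ]+$/g;

// Hypothetical helper: trims only ASCII whitespace.
function trimAsciiOnly(s: string): string {
  return s.replace(ASCII_WS_EDGES, "");
}

// Sample: VT, SPACE, "token", SPACE, NBSP.
const sample = "\u000B token \u00A0";

// The ES definition treats VT and NBSP as whitespace, so everything
// around "token" is removed.
const viaJsTrim = sample.trim();

// The ASCII-only trim stops at VT and NBSP, so nothing is removed here.
const viaAsciiOnly = trimAsciiOnly(sample);

// \s uses the same ES definition, so it matches VT as well.
const regexMatchesVt = /\s/.test("\u000B");
```

Here `viaJsTrim` is `"token"`, `viaAsciiOnly` equals `sample`, and `regexMatchesVt` is `true`, so code that reaches for `trim()` or `\s` ends up with a wider set than ASCII whitespace.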
U+000B was removed from the list of "space characters" in HTML in https://github.com/whatwg/html/commit/63e2aeb0b399b4740460388264a1b523ac6ac752
I didn't find a relevant email or bug or IRC discussion from June 2008, though.
I suspect this can't be changed at least for the HTML parser without causing XSS issues.
https://software.hixie.ch/utilities/js/live-dom-viewer/saved/13786
The biggest things still outstanding from the WebKit audit I did are https://github.com/w3c/csswg-drafts/issues/8757 and improving CSP test coverage as Mike mentioned above. (WebKit still does CSP incorrectly, but it's an easy fix.)
Thanks! It sounds like VT should not be included in whitespace. I'll incorporate this into the WPTs and implementation I write for https://github.com/whatwg/dom/pull/1079
I added the tracker to make sure we capture decisions/information like this for posterity (rather than depending on memory). Note that this is covered by I18N's best practices doc here...
I18N discussed this in our 2025-05-22 call and I was actioned with adding our thoughts to this issue. Basically, we agree that, while VT is a "sort of whitespace" character, it would be disruptive and a Bad Idea to introduce it as such in HTML's syntax. We may proceed to add HTML to our table of whitespace flavors in our best practices document to help cement the idea that nothing is wrong here.
Sounds good, closing this as not planned accordingly.
Reopening since we should add a note to Infra similar to #649.