GREASE is not well thought out and will be circumvented out of necessity
GREASE itself seems like a good idea, but I strongly believe it will be swiftly circumvented out of necessity.
Imagine a case where I would like to show a user a list of their active sessions, with information on what kind of browser each session uses. I can realistically show Chrome 86 / Chromium 86, but I cannot show the user '"Not\A;Brand / Chrome 86 / Chromium 86. This means that one of these two cases must take place:
- A list of known good brand names is created and curated; others are shown as Unknown. This will make it harder for users to identify their sessions for less known browsers.
- A list of known GREASE values is curated, or an algorithm is devised to catch all potential GREASE values. A very dumb example might be to simply assume that all brand names with `\`, `;`, `"`, or `'` in them are GREASE.
Given these two scenarios, it seems obvious to me that GREASE does not achieve what it sets out to do.
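To make the second option concrete, here is a minimal sketch of the "very dumb" heuristic described above. It is only an illustration: the names `SUSPICIOUS_CHARS`, `looks_like_grease`, and `visible_brands` are hypothetical, and the character set is taken from the example in this comment, not from any guarantee about what GREASE brands contain.

```python
# Characters that the heuristic treats as GREASE markers.
# Assumption: this set comes from the example above only; a GREASE brand
# is not guaranteed to contain any of these characters.
SUSPICIOUS_CHARS = set('\\;"\'')


def looks_like_grease(brand: str) -> bool:
    """Return True if the brand name contains any suspicious character."""
    return any(ch in SUSPICIOUS_CHARS for ch in brand)


def visible_brands(brands: list[str]) -> list[str]:
    """Drop every brand that the heuristic flags as GREASE."""
    return [b for b in brands if not looks_like_grease(b)]


brands = ["Chrome", "Chromium", "'\"Not\\A;Brand"]
print(visible_brands(brands))
```

A believable-looking fake brand with no special characters would pass straight through this filter, which is the weakness of any such catch-all.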
The purpose of GREASE is sort of two-fold. The first is to encourage clients to use a fully fleshed-out parser for the header instead of the half-baked, fragile parsers you see for the User-Agent string now. This doesn't seem to be a problem in your scenarios, but it is worth pointing out because it explains why the weird characters exist in the fake brand.
The second is to discourage whitelisting (blocking or downgrading "unknown" user agent brands), which it goes for in two ways:
- Giving a common project name along with the most specific name (e.g. Chromium alongside Edge, Brave, Vivaldi, etc.) lets websites be as generic as possible: instead of knowing all of the Chromium derivatives, a site can just look for "Chromium" and go from there. This does require website owners to be a little more understanding, but I think the trade-off is worth it for the additional traffic it makes available.
- It's not explicitly stated that those characters have to be there; that's just the example implementation. There are other strategies, such as adding other established brands or making up a believable-looking brand.
It's number two that we're hoping will stop the circumvention you're referring to.
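As a rough illustration of the first point, a site that only cares about the common project name could check for it directly instead of enumerating derivatives. This is a sketch under assumptions: the helper `is_chromium_based` is hypothetical, the brand lists are made-up examples that have already been parsed into (brand, version) pairs, and the version numbers are illustrative.

```python
def is_chromium_based(brands: list[tuple[str, str]]) -> bool:
    """Return True if any advertised brand is exactly 'Chromium' (case-insensitive)."""
    return any(name.casefold() == "chromium" for name, _version in brands)


# Hypothetical brand lists, already parsed into (brand, version) pairs.
edge = [("Chromium", "86"), ("Microsoft Edge", "86"), ("'\"Not\\A;Brand", "99")]
newcomer = [("aCuteBrowser", "1"), ("Chromium", "86")]
firefox_like = [("Firefox", "82")]

for brands in (edge, newcomer, firefox_like):
    print(is_chromium_based(brands))
```

The check compares whole brand names, so an unknown derivative is treated the same as Chrome or Edge as long as it also advertises the common brand.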
It's also the hope that these behaviours (ossified lists of brands and databases mapping header values) will go away because they'll be unnecessary. In your analytics scenario, most analytics platforms either already have an "other" bucket or simply drop unknown UA strings, which is bad for small browsers. With the proposed solution you can show:
- chromium ##
- chrome ##
- edge ##
- new kid on the block ##
- fake browser ##
And I think customers will be able to tell what's up.
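The "show every brand as-is" approach can be sketched as follows. This is a simplified illustration and not a complete Structured Fields parser: it assumes the brand list arrives as quoted strings with an optional `v="..."` parameter (as in the `Sec-CH-UA` header), and the helpers `read_quoted`, `parse_brand_list`, and `session_label` are hypothetical names. In practice a full Structured Fields library would be the safer choice.

```python
def read_quoted(text: str, i: int) -> tuple[str, int]:
    """Read a double-quoted string starting at text[i]; return (value, next index)."""
    if i >= len(text) or text[i] != '"':
        raise ValueError(f"expected opening quote at offset {i}")
    i += 1
    out = []
    while i < len(text):
        ch = text[i]
        if ch == "\\":
            # A backslash escapes the next character (a quote or a backslash).
            if i + 1 >= len(text):
                raise ValueError("dangling backslash")
            out.append(text[i + 1])
            i += 2
        elif ch == '"':
            return "".join(out), i + 1
        else:
            out.append(ch)
            i += 1
    raise ValueError("unterminated string")


def parse_brand_list(header: str) -> list[tuple[str, str]]:
    """Parse a brand list into (brand, version) pairs (simplified)."""
    items = []
    pos = 0
    while pos < len(header):
        while pos < len(header) and header[pos] in " \t":
            pos += 1  # skip whitespace before an item
        brand, pos = read_quoted(header, pos)
        version = ""
        if header.startswith(";v=", pos):
            version, pos = read_quoted(header, pos + 3)
        items.append((brand, version))
        while pos < len(header) and header[pos] in " \t":
            pos += 1  # skip whitespace after an item
        if pos < len(header):
            if header[pos] != ",":
                raise ValueError(f"expected comma at offset {pos}")
            pos += 1
    return items


def session_label(header: str) -> str:
    """Render every advertised brand, known or not, as one display label."""
    return " / ".join(f"{b} {v}".strip() for b, v in parse_brand_list(header))


# Example value; the brands and version numbers are illustrative.
header = r'"Chromium";v="86", "\"Not\\A;Brand";v="99", "Google Chrome";v="86"'
print(session_label(header))
```

Nothing here decides which brand is "real"; it only shows that the escaped characters stop being a problem once the quoting rules are respected.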
I did not present an analytics scenario; I presented a scenario in which a website (for example Facebook) shows a user the list of their logged-in sessions so they can log them out:

And unless one of the scenarios I proposed is implemented by such a site, it's going to look like this:

I think this is going to be a hit against smaller sites. It's obvious that sites like Facebook will just implement a whitelist of recognized brand names and keep it updated, but smaller sites do this too, and they may not have enough manpower to update it continuously.
Secondly, I fail to see how GREASE can stop the behavior you describe, namely the blocking of browsers. Realistically, I can still look for /chrome?(?:ium)?/i and serve worse or broken content to browsers that don't match, or show a message encouraging the user to switch to Edge if /edge/i is not found. So, please explain: how does this solve anything?
> There are other strategies, such as adding other established brands or making up a believable looking brand.
How is that different from Mozilla, (KHTML, like Gecko), Chrome, Safari? Browsers will still be forced to lie about their brand, by saying Chrome / aCuteBrowser / Chromium / Firefox / '"Not\A;Brand.
Also, if a browser advertises itself as Firefox 82 / Chromium 86 / Chrome 86 / '"Not\A;Brand, how are we supposed to know which browser is real (and why)? And if we aren't supposed to know, then what is the purpose of even including this header? Realistically, even on manual inspection it's impossible to say which browser it is, unless you also require the full version header and compare it to some mapping list, at which point GREASE is again defeated. And while a browser can technically refuse to send the full version header to a website, the website might in turn assume bad intentions and just serve broken content or outright refuse to serve any content (analogous to anti-adblock).
> It's not explicitly stated that those characters have to be there
But it is, right here:
> When adding arbitrary values to brands, user agents MUST make sure that receivers of the header adhere to Structured Header parsing, by adding escaped double-quotes, commas and semi-colons to those values. The purpose of this is to make non-compliant server implementations immediately aware that their parsing code is inadequate.
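To see why such characters expose inadequate parsing, consider a deliberately fragile parser of the kind the quoted text is aimed at. This is a sketch for illustration only: `naive_brands` is a hypothetical helper, and the GREASE-style brand in the example header is made up to contain an escaped double quote, a comma, and a semicolon.

```python
def naive_brands(header: str) -> list[str]:
    """Fragile parsing: split on commas, then cut each item at the first semicolon."""
    return [item.split(";")[0].strip().strip('"') for item in header.split(",")]


# Two items are advertised, but the second brand contains a comma and a semicolon.
header = r'"Chromium";v="86", "\"Not,A;Brand";v="99"'
result = naive_brands(header)
print(len(result), result)
```

The header advertises two brands, yet the naive split yields three fragments, none of which is the intended GREASE brand. A parser that honors the quoting rules would not be affected.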
FWIW, on the analytics thing, I opened an issue with related thinking (#115) and now a PR (#197).
> First is to encourage clients to use a fully fleshed-out parser for the header instead of the half-baked, fragile parsers you see for the User-Agent string now.
So we are making it more complicated to parse, so that developers use a more complex implementation?? I thought the point of splitting the values into their own headers was to make it easier to parse. As mentioned above, this now requires include/disallow lists or weird brand detectors.
> So we are making it more complicated to parse, so that developers use a more complex implementation??
Less "more complex", and more "less lazy".
The only way to stop people making the simplest implementation, which assumes a simple format that never changes, is to change the format regularly, and force people to code for that. Hence GREASE.
(Not a browser dev, BTW).
> I thought the point of splitting the values into their own headers was to make it easier to parse?
Yes, this is true. And to encourage better parsing that is less failure prone.
@hexydec have you run into any particular issue here? Or just dislike the design?
The issue is that you don't know what the GREASE value will be or how it is structured, so I can either make assumptions about its format in order to exclude it, or whitelist all the values that I think are valid.
This all complicates the handling of the value.
I also thought that it is supposed to be a simple format, with a defined structure, that is easy to parse. Why complicate it?
I am coming at this from the pov of web statistics.
> how it is structured, so I can make assumptions about the format that I can exclude, or I can whitelist all the values that I think are valid.
It's a pretty well-defined algorithm; see https://wicg.github.io/ua-client-hints/#create-arbitrary-brands-section
Ok thanks for sending the link to the format, that is useful.
GREASE is not something I have come across before. It's a bit of a backwards way of forcing developers not to be lazy, but I get it.