markup-validator
`check/referer` gives confusing results because browsers now default to more restrictive referrer policies
E.g., following `<a href="http://validator.w3.org/check?uri=referer">` from http://users.ugent.be/~jaschmid/ wrongly validates http://users.ugent.be/, while `<a href="http://validator.w3.org/check?uri=users.ugent.be/~jaschmid/">` works as expected.
Reported by @jamesrschmidt
Any update on this?
We are using it for education and the student server uses ~stud21 to host the student websites.
I think the problem is not linked to ~, but to the Referer header being stripped due to Referer policies (including a default of not keeping referer from https to http). @mosbth, can you point to specific current instances where this is happening?
I guess this try-out-page can provide the details needed? http://www.student.bth.se/~mosstud/test/referer.php
Click on "Click to check referer" to "self-submit" and check the referer of the site/page.
At the bottom you may try referer links to the validators. Click to see that the validators strip off the path segment of the url when they look up the referer.
It seems like Unicorn throws an exception; I'm not sure whether it's related or not.
class java.io.IOException
Server returned HTTP response code: 503 for URL: http://validator.w3.org/feed/check.cgi?url=http%3A%2F%2Fwww.student.bth.se%2F&output=ucn
java.base/jdk.internal.reflect.GeneratedConstructorAccessor1052.newInstance(Unknown Source)
java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490)
java.base/sun.net.www.protocol.http.HttpURLConnection$10.run(HttpURLConnection.java:1974)
java.base/sun.net.www.protocol.http.HttpURLConnection$10.run(HttpURLConnection.java:1969)
java.base/java.security.AccessController.doPrivileged(Native Method)
java.base/sun.net.www.protocol.http.HttpURLConnection.getChainedException(HttpURLConnection.java:1968)
java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1536)
java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1520)
org.w3c.unicorn.request.URIRequest.doRequest(URIRequest.java:141)
org.w3c.unicorn.RequestThread.run(RequestThread.java:66)
So the problem is indeed not with the validator itself, but with recent changes in how browsers send the Referer header: they now default to `strict-origin-when-cross-origin`, which sends only the origin (no path) on cross-origin requests. There is more discussion of the underlying change in e.g. https://developers.google.com/web/updates/2020/07/referrer-policy-new-chrome-default
I guess this makes the check/referer route much less useful in general for validators. I'll re-title the issue - I'm not sure yet if the referer route should just be deprecated, or if visitors coming that way should be warned of its limitation.
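To illustrate why the validator ends up seeing only the site root, here is a minimal Python sketch of how the Referer value is derived under the three policies mentioned in this thread. It is a simplification and not the browser's actual algorithm: the function name `referer_for` is hypothetical, "downgrade" is approximated as HTTPS to HTTP, and default ports and credentials in URLs are not normalised.

```python
from urllib.parse import urlsplit, urlunsplit


def referer_for(policy, source_url, target_url):
    """Approximate the Referer a browser sends from source_url to target_url.

    Returns None when the header would be omitted. Simplified sketch:
    only three policies are modelled, and ports are compared as written.
    """
    src = urlsplit(source_url)
    dst = urlsplit(target_url)

    # Full referrer: the source URL without its fragment.
    full = urlunsplit((src.scheme, src.netloc, src.path or "/", src.query, ""))
    # Origin-only referrer: scheme and host, with the path reduced to "/".
    origin_only = f"{src.scheme}://{src.netloc}/"

    same_origin = (src.scheme, src.hostname, src.port) == (
        dst.scheme, dst.hostname, dst.port)
    # Assumption: a "downgrade" is approximated as HTTPS -> HTTP.
    downgrade = src.scheme == "https" and dst.scheme == "http"

    if policy == "unsafe-url":
        return full
    if policy == "no-referrer-when-downgrade":
        return None if downgrade else full
    if policy == "strict-origin-when-cross-origin":
        if same_origin:
            return full
        return None if downgrade else origin_only
    raise ValueError(f"policy not modelled in this sketch: {policy}")


if __name__ == "__main__":
    page = "http://users.ugent.be/~jaschmid/"
    validator = "http://validator.w3.org/check?uri=referer"
    for policy in ("strict-origin-when-cross-origin",
                   "no-referrer-when-downgrade",
                   "unsafe-url"):
        print(policy, "->", referer_for(policy, page, validator))
```

Under the new default, the page from the original report is reduced to `http://users.ugent.be/`, which matches the URL the validator was observed checking.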
A) Regarding https://developers.google.com/web/updates/2020/07/referrer-policy-new-chrome-default:
Does this mean that I can change my Apache config to send the header `Referrer-Policy: no-referrer-when-downgrade` to get the referer to work again in the validators?
That would perhaps be a workaround for me in the short term.
B) In the long run I could calculate/extract the current URL and send it as an argument through the query string. I think that some validators supported that a few years back, before I started using the referer URL. Just let me know if that's the way to go from here.
Yes, I believe setting `Referrer-Policy: no-referrer-when-downgrade` (or `unsafe-url`, which I think is equivalent when used on a non-HTTPS origin) would be a workaround, either as an HTTP header or via a meta tag in the HTML.
And indeed, all the W3C validators can be called with a `uri` query string parameter to check the given URL.
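As a rough sketch of the query string approach, the Python snippet below builds a validator link that names the page explicitly instead of relying on the Referer header. The endpoint and the `uri` parameter are the ones quoted in this thread; the helper name `validator_link` is hypothetical, and how the page determines its own URL is left to the caller.

```python
from urllib.parse import urlencode

# Endpoint as used in the links quoted in this thread.
VALIDATOR_CHECK = "http://validator.w3.org/check"


def validator_link(page_url, endpoint=VALIDATOR_CHECK):
    """Return a validator URL that passes page_url via the uri parameter."""
    # urlencode percent-escapes reserved characters such as ':' and '/'.
    return f"{endpoint}?{urlencode({'uri': page_url})}"


if __name__ == "__main__":
    print(validator_link("http://www.student.bth.se/~mosstud/test/referer.php"))
```

Because the page URL travels in the link itself, the result does not depend on the browser's referrer policy.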
Fine, it seems like `<meta name="referrer" content="unsafe-url">` is doing the trick.
I made a small test program here: http://www.student.bth.se/~mosstud/test/referer-with-meta.php
That is an adequate way for us to proceed in the short (and perhaps long) term to be able to use the validators.
Otherwise I will try the query string approach.
Many thanks!
This also solves my need in https://github.com/w3c/css-validator/issues/191