
I want to always see the page in English

Open angel-luis opened this issue 4 years ago • 18 comments

I've probably pressed the Languages > English button about 1,000 times.

I'm from Spain, and whenever I arrive via Google I get the old, semi-translated Spanish page.

So please add a semi-permanent cookie in the browser, or a permanent setting in the user session, so that the preferred language is always applied.

Thank you.

angel-luis avatar Sep 20 '19 18:09 angel-luis

By the way, I think we do set cookies. I wasn't sure how to test this, so I started a clean profile in "Microsoft Edge Dev" and changed its language preference to only "Swedish".

Then I go to https://developer.mozilla.org and it redirects to https://developer.mozilla.org/sv-SE/. From there, I click on some document like https://developer.mozilla.org/sv-SE/docs/Web/HTML, and when I'm there I use the language drop-down to select English.

It switches to English and I press Yes on the dialog that appears.

If I then completely quit the browser and start it up again and type in https://developer.mozilla.org it now redirects to https://developer.mozilla.org/en-US/. So it does work. The cookie remembers.

I suspect you have cookies disabled. I can simulate that in "Microsoft Edge Dev".

Now, if I quit the browser, start it up again and go to https://developer.mozilla.org, the cookie preference is gone. It falls back on reading the browser language preference, so it goes back to redirecting to https://developer.mozilla.org/sv-SE/.

@angel-luis I'm pretty sure this means that cookies either aren't allowed at all or you have, like in my simulation, cookies cleared when you quit the browser.

Makes me wonder if perhaps we can use localStorage instead. Or both. Obviously, localStorage will never work on the server. So if the request comes without a cookie, but the browser has a non-en-US language preference, you will end up on your locale page. Perhaps, on that page we can do something like this:

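// Somewhere in the landing page's client-side JS code.
// getCurrentLocationLocale() and getTranslatedUrl() are hypothetical helpers.
const preferredLang = localStorage.getItem('preferredLanguage');
// Only redirect when a preference is stored and it differs from the current locale
if (preferredLang && preferredLang !== getCurrentLocationLocale()) {
    // history.pushState() would not load the new page, so navigate instead
    window.location.replace(getTranslatedUrl(preferredLang));
}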

What do you think?

peterbe avatar Sep 20 '19 20:09 peterbe

Hi,

Thank you for your time and your fast response.

I ran a test and yes, when I go to https://developer.mozilla.org/ it's in English by default, and I can see that my cookie is set: django_language: en-US.

The problem is when I make a search from Google. For example, I search Math Round, and I get this link: https://developer.mozilla.org/es/docs/Web/JavaScript/Referencia/Objetos_globales/Math/round

So the /es/ path is overriding my current language preference. Most of the time I reach MDN by searching on Google, so this is the main problem.

Could I somehow force the page to always request the /en-US/ path even if I'm requesting the /es/ path?

Thank you.

angel-luis avatar Sep 20 '19 21:09 angel-luis

Ah yes. This is a recurring problem for MDN. Google prefers the localized versions. I wish there was a way to say, to Google, "For developer.mozilla.org please prefer the en-US URLs. Thanks".

"Could I somehow force the page to always request the /en-US/ path even if I'm requesting the /es/ path?"

I believe there's a web extension for that. https://addons.mozilla.org/en-US/firefox/addon/mdn-language-redirector/?src=search

If enough people, like you, keep having to install this extension perhaps we should solve it centrally once and for all.

peterbe avatar Sep 20 '19 21:09 peterbe

Argh, I've installed the plugin and it redirects to the /en-US/ path, but the rest of the path is still the translated Spanish slug, so I get a 404.

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Referencia/Objetos_globales/Math/round

angel-luis avatar Sep 20 '19 21:09 angel-luis

That web extension needs some work then :( Instead of just replacing "/es/" with "/en-US/" it should go to https://developer.mozilla.org/es/docs/Web/JavaScript/Referencia/Objetos_globales/Math/round$json and read the url from the right .translations array.
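As a rough sketch of that lookup, the function below picks the URL for a target locale out of an already-parsed `$json` response. Only the `.translations` array is mentioned in this thread; the `locale` and `url` field names, the `findTranslationUrl` name and the sample data are assumptions made for illustration.

```javascript
// Assumed response shape: { translations: [{ locale, url }, ...] }.
// The field names `locale` and `url` are assumptions, not a documented API.
function findTranslationUrl(doc, targetLocale) {
  const translations = doc && Array.isArray(doc.translations) ? doc.translations : [];
  const match = translations.find((t) => t.locale === targetLocale);
  // Return null when no translation exists so the caller can stay on the current page.
  return match ? match.url : null;
}

// Hypothetical sample data, for illustration only.
const sampleDoc = {
  translations: [
    { locale: 'en-US', url: '/en-US/docs/Foo/Bar' },
    { locale: 'nl', url: '/nl/docs/Foo/Balk' },
  ],
};

console.log(findTranslationUrl(sampleDoc, 'en-US'));
```

An extension could fetch the `$json` URL for the page it landed on, pass the parsed body to a function like this, and only redirect when the result is not null.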

I don't know who wrote that extension, but perhaps you can file an issue, contribute a fix and reference this comment.

peterbe avatar Sep 20 '19 21:09 peterbe

I think this is an open bug and it ought to be fixed, because it can be fixed. We can't control Google but we can control our own cookies. Although I don't yet have a complete picture of everything, we do have a cookie-based solution that works really well when you're requesting a page without a locale prefix in the URI. However, it seems those preference redirects don't happen when you go to a page with a locale prefix.

To add a variable to the mix, most people don't want to "mess" with the localization they're given. (This is probably because so many users are in North America and are happy to stick to en-US.) It therefore makes sense to apply "aggressive HTTP caching" at the CDN level. For URLs with a locale prefix we can tell the CDN to cache the page as a static response for a long time, which means we don't get a chance to evaluate your cookies. However, we do have cookies and localStorage/sessionStorage as an option after the initial load.

We could use this to get the best of both worlds: 1) aggressive caching in the CDN, and 2) redirecting people to the locale they prefer. And that's why I think this issue should remain open.

I'd love to hear from @escattone on this.

peterbe avatar Sep 23 '19 13:09 peterbe

@peterbe Thanks. It would be great to be able to never have to switch to English again.

To be honest I don't even think it's a good idea to have NL pages. Worse than having no information is having the wrong information.

As @angel-luis said, translated pages are frequently out of date or incomplete. Even if just 1% of the pages were like that, it would still make all translated pages worthless, because you can never know whether the information you're reading is accurate and complete.

Essentially, translated pages should be automatically unpublished and marked for review whenever the content in the primary language is edited.

I can't speak for other languages, but I doubt there are any Dutch web developers - at all - who don't understand English. Hence my statement that I doubt it's a good idea to have NL pages.

WvanDam avatar Sep 24 '19 10:09 WvanDam

Thank you for your vocal support @WvanDam ! It's definitely inspiring to hear and it gives us fuel to find the time to work on it.

Now, the idea of marking translated pages as officially out of date sounds great to me. I'm still relatively new to the core team and not sure what this all means or if it's been done before. Or if it's even possible (which I doubt).

Another crazy idea is to stop encouraging Google to favor the translated pages. We could simply put a "canonical" link tag on each translated page that points to the en-US version. I.e. on /nl/docs/Foo/Bar we have the tag <link rel="canonical" href="/en-US/docs/Foo/Bar">. That would mean that Google would stop indexing /nl/docs/Foo/Bar in favor of /en-US/docs/Foo/Bar. Users can still use their cookies and stuff to get nl if they prefer that.
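A minimal sketch of how such a tag could be generated from the request path. The `canonicalLinkFor` name is made up here, and the sketch assumes the document slug is identical in every locale, which this thread shows is not always the case for translated slugs.

```javascript
// Builds a canonical <link> tag pointing at the en-US version of a localized path.
// Assumption: the part after the locale prefix is the same in every locale.
function canonicalLinkFor(pathname, canonicalLocale = 'en-US') {
  // Capture the locale prefix and the rest of the path, e.g. "/nl" + "/docs/Foo/Bar".
  const match = pathname.match(/^\/([^/]+)(\/docs\/.*)$/);
  if (!match) return null; // not a localized document URL
  return `<link rel="canonical" href="/${canonicalLocale}${match[2]}">`;
}

console.log(canonicalLinkFor('/nl/docs/Foo/Bar'));
// prints: <link rel="canonical" href="/en-US/docs/Foo/Bar">
```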

Perhaps we should get back to the topic at hand which is using some form of state (cookies, localStorage) to escape the non-en-US page if you have arrived on it.

peterbe avatar Sep 24 '19 13:09 peterbe

After my previous comment I actually noticed a notification at the top of the translated page I was directed to via Google to inform me that the translated page was out of date. This could have been added manually, but perhaps it's done automatically.

But yes, leveraging local storage for redirecting a visitor to their previously indicated preferred language regardless of the point of entry is a solid solution to the main issue at hand.

WvanDam avatar Sep 24 '19 13:09 WvanDam

Options:

  • could be handled by user profile that specifies localization (assuming logged in)
  • patch the Firefox web extension
  • non-signed in cookie that defines localization pref

@atopal what is your preference here?

tobinmori avatar Sep 30 '19 16:09 tobinmori

Thanks for the consideration. Options 1 and 2 only work for a subset of your audience, so I'd say item 3 is the preferred solution.

WvanDam avatar Sep 30 '19 16:09 WvanDam

We'll probably go with option 1 as it's the most robust one. People delete cookies, change browsers, use private mode all the time.

atopal avatar Oct 01 '19 08:10 atopal

@atopal Doesn't that assume users are logged in on all devices, in all browsers? Including private mode? I can tell you I'm definitely not.

I understand that option 1 would be your preferred option, but in any case I suggest using a simple cookie/localStorage as a fallback for those who don't want to log in each time just to fix the language issue.

WvanDam avatar Oct 01 '19 09:10 WvanDam

Options:

  • could be handled by user profile that specifies localization (assuming logged in)
  • patch the Firefox web extension
  • non-signed in cookie that defines localization pref

Another option that respects the in-browser setting and doesn't require login, extension or cookie:

  • use the Accept-Language HTTP header to set the default language

apapsch avatar Dec 23 '20 11:12 apapsch

@apapsch We use the Accept-Language header when you request a URL that doesn't explicitly specify which locale it wants. E.g.

▶ curl -I -H 'Accept-language:ja' https://developer.mozilla.org/docs/Web/CSS
HTTP/2 302
content-length: 0
server: CloudFront
date: Fri, 08 Jan 2021 15:59:19 GMT
location: /ja/docs/Web/CSS
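
For illustration, here is a rough sketch of that kind of Accept-Language negotiation. It is not MDN's actual implementation; the `pickLocale` name, the `en-US` fallback and the list of supported locales are assumptions.

```javascript
// Parses an Accept-Language header and returns the best supported locale.
function pickLocale(acceptLanguage, supported, fallback = 'en-US') {
  const ranked = acceptLanguage
    .split(',')
    .map((part) => {
      const [tag, ...params] = part.trim().split(';');
      const q = params.map((p) => p.trim()).find((p) => p.startsWith('q='));
      // A language range without a q-value has an implicit weight of 1.
      return { tag: tag.trim().toLowerCase(), q: q ? parseFloat(q.slice(2)) : 1 };
    })
    .filter((entry) => entry.tag && entry.q > 0)
    .sort((a, b) => b.q - a.q);

  for (const { tag } of ranked) {
    // Exact match first, then match on the primary subtag (e.g. "sv" for "sv-SE").
    const exact = supported.find((loc) => loc.toLowerCase() === tag);
    if (exact) return exact;
    const primary = tag.split('-')[0];
    const partial = supported.find((loc) => loc.toLowerCase().split('-')[0] === primary);
    if (partial) return partial;
  }
  return fallback;
}

const supportedLocales = ['en-US', 'es', 'ja', 'sv-SE']; // hypothetical subset
console.log(`/${pickLocale('ja', supportedLocales)}/docs/Web/CSS`);
// prints: /ja/docs/Web/CSS
```

Note that this only helps for URLs without a locale prefix, since a URL that already names a locale leaves nothing to negotiate.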

But the problem in this issue is that people stumble onto a URL whose locale (defined right there in the URL) isn't the one they wanted. E.g. you have a Spanish friend who sends you a "Hey, check out this URL" message, and they send their Spanish-locale URL.

peterbe avatar Jan 08 '21 16:01 peterbe

Thanks for the clarification! Rereading the thread, I'd say having Google index only English pages would be the single most effective measure to have users access the canonical content rather than the translated, often-outdated content.

On a side note, why is MDN encoding the locale in the URL path at all? This seems to be the source of the problem in the first place. I see the locale as part of state, and the ultimate client state is a cookie. It would be set from Accept-Language, a profile setting and a query parameter. The locale in the URL path, on the other hand, seems to be redundant information.

apapsch avatar Jan 08 '21 17:01 apapsch

The reason the locale is in the URL, and not dynamic, is that MDN isn't dynamic. (Some parts are, but not the document articles.) The content is served as static files from a CDN. We could change that so no URL has a locale in it. But then the CDN would have to include the Accept-Language header in its cache key on every request, which means many more cache misses. And if you're using a browser that's set to German but you prefer English, you'd have to rely on cookies, which makes it even harder for the CDN to get warm hits.
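
To make the caching argument concrete, here is a toy sketch that counts distinct cache keys under both schemes. The key functions and the sample header values are hypothetical, and no real CDN configuration is implied.

```javascript
// Cache key when the locale is in the URL: one entry per URL.
function keyByPath(request) {
  return request.path;
}

// Cache key when the locale comes from a header: one entry per (URL, header value) pair.
function keyByPathAndLanguage(request) {
  return `${request.path}|${request.headers['accept-language'] || ''}`;
}

// Hypothetical header values from three browsers that would all get the German page.
const headerValues = ['de', 'de-DE,de;q=0.9', 'de-DE,de;q=0.9,en;q=0.8'];

const withLocaleInPath = headerValues.map((h) =>
  keyByPath({ path: '/de/docs/Web/CSS', headers: { 'accept-language': h } }));
const withoutLocaleInPath = headerValues.map((h) =>
  keyByPathAndLanguage({ path: '/docs/Web/CSS', headers: { 'accept-language': h } }));

console.log(new Set(withLocaleInPath).size);    // prints: 1
console.log(new Set(withoutLocaleInPath).size); // prints: 3
```

The same page ends up cached three times in the second scheme, and a language cookie would multiply the number of keys again.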

What's interesting is that we submit both (or rather, all) translation URLs to Google for indexing, and we use hreflang link tags to tell Google the whole story. So they have both the Spanish and the English URL in their index. But more often than not, they list the Spanish version higher than the English version (*) for people searching from a Spanish-speaking country. There might be ways to bend that too, so we force Google to prefer the English version. But how do you make a decision like that? In some countries, like the Netherlands, where almost everyone speaks perfect English, it's one story. But in China, it's a different story. So our "escape hatch" for now is to move the control and the power over to users. But it's definitely not a solved problem yet.
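
For reference, hreflang annotations are `<link rel="alternate">` tags, one per translation. The sketch below generates them; the `hreflangLinks` name is made up, and it assumes every locale uses the same slug, which is a simplification.

```javascript
// Builds one <link rel="alternate" hreflang="..."> tag per available locale.
// Assumption: the same slug exists under every locale in `locales`.
function hreflangLinks(origin, slug, locales) {
  return locales.map(
    (locale) => `<link rel="alternate" hreflang="${locale}" href="${origin}/${locale}/docs/${slug}">`
  );
}

const alternateTags = hreflangLinks('https://developer.mozilla.org', 'Web/CSS', ['en-US', 'es']);
console.log(alternateTags.join('\n'));
```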

One possible solution is to use cookies to "override" what was served to you, after it loaded from the CDN. I.e. suppose you've set that you prefer English but you still open https://developer.mozilla.org/es/docs/Web/CSS (presumably because you clicked a link (e.g. a Google search result)). We could forcibly pop-up and say "Hey! You're on a Spanish page but you've said you wanted English. Click here to go to the English version".
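
A minimal sketch of that check, assuming the preference lives in the `django_language` cookie mentioned earlier in this thread and that the cookie is readable from client-side JavaScript (it would not be if it were set as HttpOnly). The function names are hypothetical.

```javascript
// Reads one cookie value out of a `document.cookie`-style string.
function readCookie(cookieString, name) {
  for (const pair of cookieString.split(';')) {
    const [key, ...rest] = pair.trim().split('=');
    if (key === name) return decodeURIComponent(rest.join('='));
  }
  return null;
}

// Returns the preferred locale when it differs from the locale in the URL, else null.
function preferredLocaleMismatch(cookieString, pathname) {
  const preferred = readCookie(cookieString, 'django_language');
  const current = pathname.split('/')[1] || null;
  if (!preferred || !current) return null;
  if (preferred.toLowerCase() === current.toLowerCase()) return null;
  return preferred;
}

const mismatch = preferredLocaleMismatch('django_language=en-US', '/es/docs/Web/CSS');
if (mismatch) {
  console.log(`You're on an "es" page but you've said you wanted "${mismatch}".`);
}
```

In a browser, the two arguments would be `document.cookie` and `window.location.pathname`, and the message would become a dismissible banner linking to the preferred version instead of a log line.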

(*) E.g. you search for a technical term like "IntersectionObserver" which'll be in the title in an English and a Spanish page.

peterbe avatar Jan 08 '21 17:01 peterbe

There is also an extension for Chrome/Edge: https://chrome.google.com/webstore/detail/mdn-language-redirector/phkkdccpgglghcdikkcalajiigdccnbo/related

fgeierst avatar Mar 16 '22 12:03 fgeierst