almanac.httparchive.org

Translate content to Dutch

Open tunetheweb opened this issue 3 years ago • 42 comments

These are the core templates - without which we cannot release any translated chapters. They are in the language specific templates directory:

  • [x] base.html - @noah-vdv
  • [x] base_chapter.html - @noah-vdv
  • [x] error.html - @noah-vdv

2022

  • [x] base.html -
  • [x] contributors.html -
  • [x] index.html -
  • [x] table_of_contents.html -

2021

  • [x] base.html -
  • [ ] base.html - foreword
  • [x] contributors.html -
  • [x] index.html -
  • [x] table_of_contents.html -

2020

  • [x] base.html - @noah-vdv
  • [x] contributors.html - @noah-vdv
  • [x] index.html - @noah-vdv
  • [x] table_of_contents.html - @noah-vdv

2019

  • [x] base.html - @noah-vdv
  • [x] contributors.html - @noah-vdv
  • [x] index.html - @noah-vdv
  • [x] table_of_contents.html - @noah-vdv

These are the chapters to be translated, in rough order of popularity. They exist in the content directory:

2022

  • [ ] css -
  • [ ] javascript -
  • [ ] markup -
  • [ ] structured-data -
  • [ ] fonts -
  • [ ] media -
  • [ ] webassembly -
  • [ ] third-parties -
  • [ ] interoperability -
  • [ ] seo -
  • [ ] accessibility -
  • [ ] performance -
  • [ ] privacy -
  • [ ] security -
  • [ ] mobile-web -
  • [ ] capabilities -
  • [ ] pwa -
  • [ ] cms -
  • [ ] jamstack -
  • [ ] sustainability -
  • [ ] page-weight -
  • [ ] cdn -
  • [ ] http -

2021

  • [ ] css -
  • [ ] javascript -
  • [ ] markup -
  • [ ] structured-data -
  • [ ] media -
  • [ ] webassembly -
  • [ ] third-parties -
  • [ ] seo -
  • [ ] accessibility -
  • [ ] performance -
  • [x] privacy - @VictorLeP
  • [ ] security -
  • [ ] mobile-web -
  • [ ] capabilities -
  • [ ] pwa -
  • [ ] cms -
  • [ ] ecommerce -
  • [ ] jamstack -
  • [ ] page-weight -
  • [ ] compression -
  • [ ] caching -
  • [ ] cdn -
  • [ ] resource-hints -
  • [ ] http -

2020

  • [ ] css -
  • [x] javascript - @noah-vdv
  • [ ] seo - @noah-vdv
  • [x] markup - @noah-vdv
  • [x] accessibility - @noah-vdv
  • [x] cms - @noah-vdv
  • [x] performance - @noah-vdv
  • [x] security - @noah-vdv
  • [x] jamstack - @noah-vdv
  • [x] fonts - @noah-vdv
  • [x] mobile-web - @noah-vdv
  • [x] pwa - @noah-vdv
  • [x] http2 - @noah-vdv
  • [x] capabilities - @noah-vdv
  • [x] third-parties - @noah-vdv
  • [x] page-weight - @noah-vdv
  • [x] caching - @noah-vdv
  • [x] resource-hints - @noah-vdv
  • [x] privacy - @noah-vdv
  • [x] compression - @noah-vdv
  • [x] ecommerce - @noah-vdv
  • [x] media - @noah-vdv

2019

  • [ ] javascript -
  • [ ] css -
  • [ ] seo -
  • [ ] performance -
  • [ ] http2 -
  • [ ] markup -
  • [ ] third-parties -
  • [ ] fonts -
  • [x] accessibility - @noah-vdv
  • [ ] pwa -
  • [ ] cms -
  • [ ] security -
  • [ ] media -
  • [ ] mobile-web -
  • [ ] cdn -
  • [ ] ecommerce -
  • [ ] page-weight -
  • [ ] caching -
  • [ ] compression -
  • [ ] resource-hints -

Additionally, the following pages need to be translated too, in the language specific templates directory:

  • [ ] 2021/methodology.html -
  • [x] 2019/methodology.html - @noah-vdv
  • [x] 2020/methodology.html - @noah-vdv
  • [x] accessibility_statement.html - @noah-vdv
  • [x] accessibility_statement.html additional update - @VictorLeP #2596
  • [x] base_ebook.htm - @noah-vdv

There is no need to translate the chapter HTML pages, as they are generated from the markdown combined with the above templates.

Please include "Makes progress on #1750" in all pull requests so a link is created from the PR to this issue.


Common notes for writing consistency are here: https://github.com/HTTPArchive/almanac.httparchive.org/wiki/Translators'-Guide. Feel free to edit that and/or add Dutch-specific extras by editing this comment.

Dutch specific extra advice:

tunetheweb avatar Dec 11 '20 17:12 tunetheweb

So the html files don't need to be translated right? Just the files under 2020, 2019 and additional files?

strangernr7 avatar Dec 11 '20 19:12 strangernr7

The core html files DO need to be translated.

The chapter HTMLs not shown above (e.g. performance.html) do not need to be translated as they are generated from the markdown (e.g. performance.md). We used to store these generated chapter HTML files in git but no longer do, so I can probably remove the statement about them as it's more confusing than useful. Then again, if you run the site locally (see instructions in src/README.md) it will generate these files, but they can be ignored from a translation point of view.

Does that make sense, or have I just confused you more than helped? 😀

tunetheweb avatar Dec 11 '20 19:12 tunetheweb

alright, nah that helped thanks

strangernr7 avatar Dec 11 '20 19:12 strangernr7

@noah-vdv thanks for your help translating this content! I see the base templates have been checked off. Have those translations been submitted yet?

rviscomi avatar Dec 15 '20 00:12 rviscomi

No problem. I have checked them off indeed but realized that I might have translated bits that didn't need translating and totally forgot to check if some pieces needed the <lang="en"> tags. So I'll double-check those html files and get a PR out on Thursday hopefully.

strangernr7 avatar Dec 15 '20 06:12 strangernr7

Feel free to open a draft PR for now and I can give feedback.

To @rviscomi's point, we normally put your name beside something you’re working on (so others don’t work on it too) but only tick it off after the PR is accepted, to show it's “done”. I’ve updated the first comment to reflect that now.

tunetheweb avatar Dec 15 '20 06:12 tunetheweb

BTW, since you've done the base templates we're ready to send this language live, so (at the risk of making this PR even bigger!) can you add Dutch to server/language.py (lines 45-54) and also add nl to supported_languages in config/2019.json and config/2020.json?
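
The config half of that request boils down to appending nl to a list. Here is a minimal Python sketch, assuming supported_languages is a plain list of language codes at the top level of the JSON. The sample config string and the add_supported_language helper are hypothetical, and the real config files contain other keys that are not shown here.

```python
import json


def add_supported_language(config_text: str, lang_code: str) -> str:
    """Return the config JSON with lang_code added to supported_languages.

    Assumption: supported_languages is a top-level list of language codes.
    """
    config = json.loads(config_text)
    languages = config.setdefault("supported_languages", [])
    # Only append when missing, so running this twice does not duplicate it.
    if lang_code not in languages:
        languages.append(lang_code)
    return json.dumps(config, indent=2, ensure_ascii=False)


# Hypothetical, heavily trimmed stand-in for one of the yearly config files.
example_config = '{"supported_languages": ["en", "ja"]}'
updated_config = add_supported_language(example_config, "nl")
```

In practice this is a one-line manual edit per file; the sketch only shows the shape of the change.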

tunetheweb avatar Dec 15 '20 06:12 tunetheweb

Should I translate content such as First Input Delay or keep it as is?

strangernr7 avatar Dec 17 '20 13:12 strangernr7

I would say not, as it's a technical term. Not quite code, but close enough. Maybe add the Dutch translation after the first time it's used.

tunetheweb avatar Dec 17 '20 13:12 tunetheweb

What has changed in the 2020 base.html file for it to be translated?

strangernr7 avatar Jan 15 '21 18:01 strangernr7

Oh. Looks like you already got the foreword. Most languages didn't. Will remove that one!

Did you see there were some changes to Markup chapter?

tunetheweb avatar Jan 15 '21 18:01 tunetheweb

Alright, yeah I'll get to those before I do the seo chapter

strangernr7 avatar Jan 15 '21 18:01 strangernr7

I was going through the 2020 seo chapter and realized there are still 3 TODOs for authors at around line 500

strangernr7 avatar Jan 21 '21 13:01 strangernr7

And in some other chapters as well so I'll just do those later

strangernr7 avatar Jan 21 '21 14:01 strangernr7

I was going through the 2020 seo chapter and realized there are still 3 TODOs for authors at around line 500

Good spot. I've reviewed them and am happy to just remove them. I count four, not three (though one is marked as a Note rather than a TODO):

https://github.com/HTTPArchive/almanac.httparchive.org/blob/997997be970f7b1d546702366891cff941baa7b1/src/content/en/2020/seo.md#L504

Seems pretty obvious to me so don't think it needs further comment. Maybe it was added since that comment was added.

https://github.com/HTTPArchive/almanac.httparchive.org/blob/997997be970f7b1d546702366891cff941baa7b1/src/content/en/2020/seo.md#L517

Yeah, it would have been nice to have some interpretation rather than just the stats, but this isn't the only chapter to do this, so let's leave it for now and remove the TODO.

https://github.com/HTTPArchive/almanac.httparchive.org/blob/997997be970f7b1d546702366891cff941baa7b1/src/content/en/2020/seo.md#L531

Rick edited this and I trust his understanding of Lighthouse, so to me this is just a comment for authors during review and the TODO can be removed.

https://github.com/HTTPArchive/almanac.httparchive.org/blob/997997be970f7b1d546702366891cff941baa7b1/src/content/en/2020/seo.md#L779

Same as above so TODO can be removed.

Would you mind removing them from the English version too as part of this PR?

And in some other chapters as well so I'll just do those later

I'll try to review those similarly and submit a PR for them. I see Compression, JavaScript, Privacy (currently under re-review anyway), Resource-Hints (this TODO can be removed), Security (not edited yet anyway), and SEO (discussed above). Hold off translating them for now but SEO and Resource Hints can be done.

tunetheweb avatar Jan 21 '21 14:01 tunetheweb

Alright, so I'll just remove the 4 in seo and the one in res hints right?

strangernr7 avatar Jan 21 '21 14:01 strangernr7

Yeah, if you're translating any of those chapters. It's probably a bit confusing to remove them from a chapter you're not translating, so if you've moved on to another chapter instead then I'll take care of them in the next day or two. Let me know which chapter you plan to do next so I know which to remove.

tunetheweb avatar Jan 21 '21 14:01 tunetheweb

I've removed them from the English version in another branch so I can just make a PR with that

strangernr7 avatar Jan 21 '21 14:01 strangernr7

Should we also translate the README (src/README), CONTRIBUTING and CODE_OF_CONDUCT md files? So if people check out this repo they can read it in other languages as well.

strangernr7 avatar Feb 09 '21 12:02 strangernr7

I say no. The project is run in English, so you need to understand English (or have a translator to help you) if you want to be involved in the project on GitHub.

However, the output of the project (the website) is translated to make it as widely available as possible.

tunetheweb avatar Feb 09 '21 13:02 tunetheweb

Should github.com links be added to the list of automatically added hreflang="en" ? I don't think I've encountered repos with translated README.MD (or any other) files yet which would make them only available in English.

strangernr7 avatar Mar 01 '21 09:03 strangernr7

Could do. I quite like the fact that we're not explicitly listing sites (e.g. Mozilla or Wikipedia) but are doing it based on URLs containing /en/ or /en-US/ or starting with https://en. so it's more generic (see below). Adding an explicit site like github.com does make more of an assumption.


/*
 * Automatically adds language after an anchor if not same language
 *
 * Add for links that are obviously in English
*/
html:not([lang="en"]) main a[href*="/en-US/"]::after,
html:not([lang="en"]) main a[href*="/en/"]::after,
html:not([lang="en"]) main a[href^="https://en."]::after {
  content: '(en)';
  vertical-align: super;
  font-size: 0.6em;
}

/*
 * Add links if an explicit `hreflang` attribute exists
 */
main a[hreflang]::after {
  content: '(' attr(hreflang) ')';
  vertical-align: super;
  font-size: 0.6em;
}

/*
 * Remove it for English in English pages
 * (allows us to add this to base content to make it easier for translators)
 */
html[lang="en"] main a[hreflang="en"]::after {
  content: '';
}

It's the same for https://web.dev resources, which are in English (though they have launched Polish versions).
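
To make it concrete why github.com and web.dev links are not picked up by the URL-based rules, here is a rough Python mirror of the three attribute selectors in the first CSS rule above. The function name looks_english and the sample URLs are made up for illustration; the real logic lives only in the CSS.

```python
def looks_english(href: str) -> bool:
    """Mirror the three CSS attribute selectors from the first rule.

    a[href*="/en-US/"]      -> substring match
    a[href*="/en/"]         -> substring match
    a[href^="https://en."]  -> prefix match
    """
    return (
        "/en-US/" in href
        or "/en/" in href
        or href.startswith("https://en.")
    )


# Illustrative URLs only.
samples = [
    "https://developer.mozilla.org/en-US/docs/Web/HTML",
    "https://en.wikipedia.org/wiki/HTTP",
    "https://github.com/HTTPArchive/almanac.httparchive.org",
    "https://web.dev/vitals/",
]
results = {url: looks_english(url) for url in samples}
```

The first two sample URLs match and the last two do not, which is why those sites would need either an explicit hreflang attribute or a site-specific selector.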

I think the better way is to make a bulk change to these at source in the English markdown files. This looks quite easy with Visual Studio Code's regex replace:

Search: \[([^\]]*)\]\((https:\/\/web.dev[^\)]*)\)
Replace: <a hreflang="en" href="$2">$1</a>
Files to include: content

And:

Search: \[([^\]]*)\]\((https:\/\/github.com[^\)]*)\)
Replace: <a hreflang="en" href="$2">$1</a>
Files to include: content
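
For anyone who wants to try the replacement outside the editor, here is a minimal Python sketch of the same idea. It merges the two searches into one pattern and escapes the dots in the hostnames, so it is slightly stricter than the expressions above. The function name add_hreflang and the sample text are made up for illustration.

```python
import re

# Markdown links whose URL starts with https://web.dev or https://github.com.
# Group 1 is the link text, group 2 is the URL.
LINK_PATTERN = re.compile(
    r"\[([^\]]*)\]\((https://(?:web\.dev|github\.com)[^)]*)\)"
)


def add_hreflang(markdown: str) -> str:
    """Rewrite matching markdown links as anchors with hreflang="en"."""
    return LINK_PATTERN.sub(r'<a hreflang="en" href="\2">\1</a>', markdown)


sample = (
    "See [Core Web Vitals](https://web.dev/vitals/) and "
    "[MDN](https://developer.mozilla.org/en-US/docs/Web)."
)
converted = add_hreflang(sample)
```

Only the web.dev link in the sample is rewritten; the MDN link is left as markdown because its /en-US/ path is already covered by the CSS rule.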

What do you think?

tunetheweb avatar Mar 01 '21 10:03 tunetheweb

I have no idea. I'm probably not the right person to comment on that seeing as I don't understand it completely 😁.

strangernr7 avatar Mar 01 '21 10:03 strangernr7

See #2047 - have a look at the Dutch ones and see if you agree.

tunetheweb avatar Mar 01 '21 12:03 tunetheweb

https://github.com/HTTPArchive/almanac.httparchive.org/blob/f3975a2f87d1b1ad32a4f2d4cb47f557a9a9ceaf/src/content/en/2020/mobile-web.md#L156 Is mSpeed a typo here or is it supposed to be like that?

strangernr7 avatar Mar 11 '21 12:03 strangernr7

Good question!

@spanicker I'm guessing this means "mobile speed", and I do see the referenced report uses "mCommerce" in a similar way, so I'm not sure if this is now a thing that passed me by. Maybe it's better to just spell it out as "mobile speed" to avoid any confusion, since this is the only instance in the chapter and that term isn't used in the referenced report?

tunetheweb avatar Mar 11 '21 14:03 tunetheweb

Just a heads up: the 2020 ecommerce chapter has #jrharalson_bio: TODO above the Intro. And I'm pretty sure featured_stat_label_1: Mobile sizes identified as ecommerce sites should say "Mobile sites"

strangernr7 avatar Mar 14 '21 22:03 strangernr7

Yeah we never got a bio from @jrharalson - Jason if you see this and wanna provide one then please do. It appears at the bottom of the chapter.

Agree on the featured stat correction.

tunetheweb avatar Mar 15 '21 11:03 tunetheweb

https://github.com/HTTPArchive/almanac.httparchive.org/blob/26509af2b51584e6fd597c2bd110670fa52b01fe/src/content/en/2020/caching.md#L121 Should Caching entity stay as is or be translated? As well as anywhere else in the chapter

strangernr7 avatar Apr 04 '21 12:04 strangernr7

I would translate it. It's not a technical term as such. Same for Eviction and Revalidation further down that list. On the other hand, Time to Live (TTL) is a well-known technical term so I wouldn't translate that.

tunetheweb avatar Apr 04 '21 14:04 tunetheweb