jekyll: fix sitemap lastmod
Follow-up to https://github.com/docker/docker.github.io/pull/15250#issuecomment-1197924756
Fixes the sitemap's last modification date (lastmod) not being taken into account.

The date is now taken from Git, falling back to the file's mtime if necessary.
Also adds entries to the sitemap exclusion list for pages that should not be listed; see https://docs-stage.docker.com/sitemap.xml

Signed-off-by: CrazyMax [email protected]
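For illustration, the Git-first, mtime-fallback lookup described above might look like the following minimal sketch. The helper name `last_modified` is hypothetical, and the sketch shells out to the `git` CLI; it is not necessarily how the plugin in this PR does it.

```ruby
require "open3"
require "time"

# Hypothetical helper (not from this PR): returns the last modification
# time of `path`, preferring the committer date of the most recent commit
# that touched the file and falling back to the file's mtime.
def last_modified(path)
  stdout, _stderr, status = Open3.capture3(
    "git", "-C", File.dirname(path),
    "log", "-1", "--format=%cI", "--", File.basename(path)
  )
  date = stdout.strip
  # Untracked files give empty output; outside a repository git exits non-zero.
  return Time.iso8601(date) if status.success? && !date.empty?

  File.mtime(path)
rescue Errno::ENOENT, ArgumentError
  # git is not installed, or its output could not be parsed as ISO 8601.
  File.mtime(path)
end
```

A file that is untracked, or that lives outside any repository, simply gets its mtime, so the sitemap entry still carries a date.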
Deploy Preview for docsdocker ready!
Built without sensitive environment variables
| Name | Link |
|---|---|
| Latest commit | 7bba3a33fd33b0c545c360fea7885098fdefb32a |
| Latest deploy log | https://app.netlify.com/sites/docsdocker/deploys/631608b2f2b14c0008366e6b |
| Deploy Preview | https://deploy-preview-15267--docsdocker.netlify.app |
preview: https://deploy-preview-15267--docsdocker.netlify.app/sitemap.xml
@thaJeztah @usha-mandya This PR would make search engines happy :slightly_smiling_face:
@thaJeztah Apart from your concerns about the .git folder being added, does it LGTY?
I ran a comparison in our CI to check whether there is a significant perf regression:
Last build on master branch: https://github.com/docker/docker.github.io/runs/8111429336?check_suite_focus=true#step:4:571
#15 45.91 done in 45.237 seconds.
#15 45.91 Auto-regeneration: disabled. Use --watch to enable.
#15 DONE 46.3s
This PR: https://github.com/docker/docker.github.io/runs/8116505924?check_suite_focus=true#step:4:577
#15 57.14 done in 56.239 seconds.
#15 57.14 Auto-regeneration: disabled. Use --watch to enable.
#15 DONE 57.7s
That seems acceptable to me, and the change is worth it to fix our sitemap so search engines can consume it and make relevant decisions about results.
We could also generate the sitemap only on CI, behind a new env var, so that developers are not impacted, but I think it's fine as it is atm.
Thanks Kevin. @thaJeztah Does it LGTY?
Deploy Preview for docsdocker ready!
| Name | Link |
|---|---|
| Latest commit | a16eff5814bc380c6379c31a87f4f6bc840f4c87 |
| Latest deploy log | https://app.netlify.com/sites/docsdocker/deploys/63592cc155308c00084648aa |
| Deploy Preview | https://deploy-preview-15267--docsdocker.netlify.app |
Made some changes related to https://github.com/docker/docs/pull/15239#issuecomment-1200164942: remote resources are now fetched using Git and cached, which improves incremental builds. This speeds up local builds by almost 62%:
Before:
#15 45.91 done in 45.237 seconds.
#15 45.91 Auto-regeneration: disabled. Use --watch to enable.
#15 DONE 46.3s
Now:
#17 17.84 done in 17.198 seconds.
#17 17.84 Auto-regeneration: disabled. Use --watch to enable.
#17 DONE 18.2s
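As a quick check on the figures quoted in this thread, the relative change in build time can be computed from the logged durations. The function name `percent_change` is only illustrative.

```ruby
# Illustrative helper: relative change in build time, in percent.
# Negative values mean the build got faster.
def percent_change(before, after)
  (after - before) / before * 100.0
end

# Durations (seconds) taken from the build logs quoted in this thread.
puts percent_change(45.237, 17.198).round(1) # cached Git fetch vs. master: -62.0
puts percent_change(45.237, 56.239).round(1) # earlier revision vs. master: 24.3
```

So the earlier revision of this PR was roughly 24% slower than master in CI, while the cached fetch cuts the build time by about 62%.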
With this new way to fetch remote resources we are now able to set the right last modification date for remote pages too. cc @usha-mandya @dvdksn
The incremental builds are really fast with this 🤩
But I noticed that the first build can take a really long time. Build time on main for a clean build is ~180 seconds for me, whereas in this branch it finished after ~240 seconds. I wonder if it's because we're cloning history? I tried setting git.clone(depth: 1) and it seems to fix the speed issue; I'm now back at ~180 seconds for a clean build. WDYT?
diff --git a/_plugins/fetch_remote.rb b/_plugins/fetch_remote.rb
index 4370baaccb..158fb3919a 100644
--- a/_plugins/fetch_remote.rb
+++ b/_plugins/fetch_remote.rb
@@ -55,11 +55,11 @@ module Jekyll
rescue => e
FileUtils.rm_rf(clonedir)
puts " Cloning repository into #{clonedir}"
- git = Git.clone("#{entry['repo']}.git", Pathname.new(clonedir), branch: entry['ref'])
+ git = Git.clone("#{entry['repo']}.git", Pathname.new(clonedir), branch: entry['ref'], depth: 1)
end
else
puts " Cloning repository into #{clonedir}"
- git = Git.clone("#{entry['repo']}.git", Pathname.new(clonedir), branch: entry['ref'])
+ git = Git.clone("#{entry['repo']}.git", Pathname.new(clonedir), branch: entry['ref'], depth: 1)
end
entry['paths'].each do |path|
> I tried setting `git.clone(depth: 1)` and it seems to fix the speed issue; I'm now back at ~180 seconds for a clean build. WDYT?
Ah good point yes we don't need the history anyway, will make this change :+1:
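For reference, the shallow clone agreed on above can also be sketched with the plain `git` CLI instead of the `git` gem. The helper names below are hypothetical and the URL in the usage example is a placeholder.

```ruby
require "open3"

# Hypothetical helper: builds the argv for a shallow clone of one ref.
# `--depth` implies `--single-branch`, so only the tip of `ref` is fetched.
def shallow_clone_command(repo, ref, clonedir, depth: 1)
  ["git", "clone", "--depth", depth.to_s, "--branch", ref, "#{repo}.git", clonedir]
end

# Hypothetical helper: runs the clone and raises if git reports a failure.
def shallow_clone(repo, ref, clonedir)
  _stdout, stderr, status = Open3.capture3(*shallow_clone_command(repo, ref, clonedir))
  raise "git clone failed: #{stderr}" unless status.success?

  clonedir
end

# Usage (placeholder repository URL and directory):
# shallow_clone("https://example.com/org/project", "main", "_remote/project")
```

One thing that may be worth checking: in a depth-1 clone every file's most recent commit is the fetched tip, so per-file dates read from that clone's history would all resolve to the same commit.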