feat(plugin-sitemap): add sitemap plugin (close #337)

Open Mister-Hope opened this issue 4 years ago • 38 comments

Close #337

Mister-Hope avatar Jul 10 '21 04:07 Mister-Hope

Additional: CI failed because of coverage, but since this plugin is deeply coupled with the VuePress plugin API, it's hard to add tests.

Mister-Hope avatar Jul 10 '21 07:07 Mister-Hope

Got it. I will update it in the next few days!

Mister-Hope avatar Aug 01 '21 10:08 Mister-Hope

@meteorlxy @Mister-Hope Thank you so much for the great work. Is there an estimated release date for this plugin?

ramesh-dada avatar Aug 03 '21 06:08 ramesh-dada

I have finished all the changes, except that I kept the zh JSDoc comments in SitemapOptions.

IMO, as we are providing both Chinese and English docs, it's reasonable to keep both languages in the final options provided to users, to make sure they get full hints.

Mister-Hope avatar Aug 05 '21 09:08 Mister-Hope

> IMO, as we are providing both Chinese and English docs, it's reasonable to keep both languages in the final options provided to users, to make sure they get full hints.

Remove them, or we need to add translations to all other comments.

meteorlxy avatar Aug 05 '21 09:08 meteorlxy

It's great to know that sitemap support is coming soon. I also appreciate @Mister-Hope's sitemap plugin. I will use your plugin until the official release is online.

maodou38 avatar Aug 06 '21 02:08 maodou38

@Mister-Hope Before you merge this, could you please also take a look at this: https://github.com/vuepress/vuepress-next/issues/353

ramesh-dada avatar Aug 08 '21 05:08 ramesh-dada

I would prefer not to support either of the 2 feature requests in #353. For my personal reasons, see https://github.com/vuepress/vuepress-next/issues/353#issuecomment-898859697

If you have different ideas and think either of them should be supported, just leave a message.

Mister-Hope avatar Aug 14 '21 07:08 Mister-Hope

Why has this pull request bogged down?

maodou38 avatar Aug 25 '21 02:08 maodou38

> Why has this pull request bogged down?

I am a postgraduate student, just kind of busy in real life.

Mister-Hope avatar Aug 26 '21 01:08 Mister-Hope

Any progress on the sitemap feature?

liziwl avatar Oct 05 '21 02:10 liziwl

What are the chances that this lands in the 2.0 release?

KnorpelSenf avatar Dec 09 '21 21:12 KnorpelSenf

Any update here?

jrcharles avatar Dec 31 '21 15:12 jrcharles

I will finish it once my winter vacation begins; I'm just busy being a postgraduate student studying quantum physics.

Mister-Hope avatar Jan 04 '22 06:01 Mister-Hope

@meteorlxy Should be ready

Mister-Hope avatar Jan 09 '22 06:01 Mister-Hope

Some explanation:

  • A lot of plugin options have been renamed.
  • modifyTimeGetter is better with a Page arg; see the example in the plugin docs.
  • A short description of sitemaps is added.

Mister-Hope avatar Jan 15 '22 14:01 Mister-Hope

@Mister-Hope Hi, I just used it today. It works as well as I imagined, but there are some problems and bugs.

  1. The priority option is documented (docs/zh/reference/plugin/sitemap.md:85), but I found that it does not exist in the SitemapOptions type, which means the configuration will not take effect. This option is a very good design for normal pages.
  2. About the robots.txt generation options, see: https://developers.google.com/search/docs/advanced/robots/robots_txt
  3. The default value of excludeUrls does not work. If I don't add the option manually, the 404 page is not excluded.
  4. excludeFrontmatter does not work until I configure the excludeUrls option.

Zhengqbbb avatar Jan 21 '22 07:01 Zhengqbbb

> @Mister-Hope Hi, I just used it today. It works as well as I imagined, but there are some problems and bugs.
>
>   1. The priority option is documented (docs/zh/reference/plugin/sitemap.md:85), but I found that it does not exist in the SitemapOptions type, which means the configuration will not take effect. This option is a very good design for normal pages.
>   2. About the robots.txt generation options, see: https://developers.google.com/search/docs/advanced/robots/robots_txt
>   3. The default value of excludeUrls does not work. If I don't add the option manually, the 404 page is not excluded.
>   4. excludeFrontmatter does not work until I configure the excludeUrls option.

Hi, thanks for the feedback. I will have a look at 1, 3 and 4 later, but I don't follow what you mean by 2. The whole output folder is deployed directly, so if the user sets a base, there is no way for me to generate a robots.txt and place it outside the dest folder. In that case the robots.txt should be ignored by Google. Also, I am not sure that every search engine refuses to read a robots.txt in a subfolder. Besides, a sitemap plugin should not try to set Allow and Disallow for developers. In my case, I have a valid robots.txt in my project's public folder, and I only want the plugin to append the sitemap link for me. If you have a better idea, please point it out, since I have no idea how to improve it now.

Mister-Hope avatar Jan 21 '22 10:01 Mister-Hope

@Mister-Hope

> Hi, thanks for the feedback. I will have a look at 1, 3 and 4 later, but I don't follow what you mean by 2. The whole output folder is deployed directly, so if the user sets a base, there is no way for me to generate a robots.txt and place it outside the dest folder. In that case the robots.txt should be ignored by Google. Also, I am not sure that every search engine refuses to read a robots.txt in a subfolder. Besides, a sitemap plugin should not try to set Allow and Disallow for developers. In my case, I have a valid robots.txt in my project's public folder, and I only want the plugin to append the sitemap link for me. If you have a better idea, please point it out, since I have no idea how to improve it now.

OK, I just think that if robots.txt is generated automatically, there should be a better place to configure it. Someone may need to stop search spiders from crawling their website to keep it from being indexed, or to target only Google's spider, so we could provide an option to set User-agent. If you want to set uniformly which parts of the site are recommended for indexing and which are not, you need Allow and Disallow.

Zhengqbbb avatar Jan 21 '22 11:01 Zhengqbbb

I will change the logic: only when base is / and the user has a robots.txt in the public folder will the plugin try to add the sitemap URL to it. This should be better.
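
For illustration only, the rule described above could be sketched roughly as follows. This is not the plugin's actual code: appendSitemapToRobots, its parameters and the sitemap.xml default are hypothetical names chosen for the example.

```typescript
// Hypothetical helper; the name, signature and defaults are illustrative only.
function appendSitemapToRobots(
  robotsTxt: string | null, // contents of public/robots.txt, or null if the file is absent
  base: string, // site base, e.g. "/" or "/docs/"
  hostname: string, // e.g. "https://example.com"
  sitemapFilename: string = "sitemap.xml",
): string | null {
  // Rule from the thread: only act when the site is served from "/"
  // and the user already ships a robots.txt.
  if (base !== "/" || robotsTxt === null) return null;

  const sitemapLine = `Sitemap: ${hostname.replace(/\/+$/, "")}/${sitemapFilename}`;

  // Leave the file untouched if the link is already there (repeated builds).
  const alreadyListed = robotsTxt
    .split(/\r?\n/)
    .some((line) => line.trim() === sitemapLine);
  if (alreadyListed) return robotsTxt;

  // Keep the user's rules as they are and append one line at the end.
  return `${robotsTxt.replace(/\s+$/, "")}\n${sitemapLine}\n`;
}

const updated = appendSitemapToRobots("User-agent: *\nAllow: /\n", "/", "https://example.com");
console.log(updated);
```

Returning null for a non-root base or a missing file keeps the plugin from writing Allow or Disallow rules on the user's behalf, which matches the position taken earlier in the thread.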

Mister-Hope avatar Jan 22 '22 02:01 Mister-Hope

And for 1, you should set priority in frontmatter.sitemap.priority; it's injected via the spread operator, I think.

    const sitemapInfo: SitemapPageInfo = {
      changefreq,
      links,
      ...(lastmodifyTime ? { lastmod: lastmodifyTime } : {}),
      ...frontmatterOptions,
    }

Is this option problematic to you?
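
As a self-contained illustration of why the spread order in the snippet above matters, here is a simplified sketch. The SitemapPageInfo shape and the buildSitemapInfo function below are stand-ins assumed for the example, not the plugin's real types.

```typescript
// Simplified stand-in for the plugin's type; the real one may differ.
interface SitemapPageInfo {
  changefreq: string;
  links: string[];
  lastmod?: string;
  priority?: number;
}

// Hypothetical wrapper around the object literal quoted in the thread.
function buildSitemapInfo(
  defaults: { changefreq: string; links: string[] },
  lastmodifyTime: string | undefined,
  frontmatterOptions: Partial<SitemapPageInfo>,
): SitemapPageInfo {
  return {
    ...defaults,
    // Only add lastmod when a modification time is known.
    ...(lastmodifyTime ? { lastmod: lastmodifyTime } : {}),
    // Spread last, so keys set in frontmatter.sitemap override everything above.
    ...frontmatterOptions,
  };
}

const info = buildSitemapInfo(
  { changefreq: "daily", links: [] },
  "2022-01-22",
  { priority: 0.8, changefreq: "monthly" },
);
console.log(info.priority, info.changefreq, info.lastmod);
```

Because frontmatterOptions is spread last, a priority set in a page's frontmatter ends up in the result even though the object literal never names that key.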

Mister-Hope avatar Jan 22 '22 02:01 Mister-Hope

The missing line in https://github.com/vuepress/vuepress-next/pull/277/commits/f043211d2cea84c83396cf1f91c1123fa4b7d22a should solve both 3 and 4. @Zhengqbbb Thanks for pointing out the bug.

Mister-Hope avatar Jan 22 '22 02:01 Mister-Hope

> I will change the logic: only when base is / and the user has a robots.txt in the public folder will the plugin try to add the sitemap URL to it. This should be better.

I also think this is the right design; after all, this is a sitemap plugin. But robots.txt is spider-friendly for sites that don't submit sitemaps. In the documentation, it would be better to mention that robots.txt needs to be added to the public folder.

Zhengqbbb avatar Jan 22 '22 08:01 Zhengqbbb

> And for 1, you should set priority in frontmatter.sitemap.priority; it's injected via the spread operator, I think. Is this option problematic to you?

  • But you can see that it is not mentioned in the English document, only in the Chinese one.
  • I think the significance of this option is that, for md files on normal pages that do not declare it in frontmatter, it provides a unified source of priority. After all, not everyone is willing to think about the priority of each page.
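
The fallback described in the second point could look roughly like the sketch below. It assumes a plugin-level priority option next to the per-page frontmatter.sitemap.priority; the interfaces and resolvePriority are hypothetical names used only for this example.

```typescript
// Hypothetical shapes, assumed for the example only.
interface PageFrontmatter {
  sitemap?: { priority?: number };
}

interface PluginOptions {
  priority?: number; // site-wide default, used when a page declares nothing
}

function resolvePriority(
  frontmatter: PageFrontmatter,
  options: PluginOptions,
): number | undefined {
  // "??" rather than "||", so an explicit page priority of 0 is respected.
  return frontmatter.sitemap?.priority ?? options.priority;
}

console.log(resolvePriority({}, { priority: 0.5 }));
console.log(resolvePriority({ sitemap: { priority: 0.9 } }, { priority: 0.5 }));
```

With this shape, authors only touch frontmatter on the few pages that need a different value, and every other page inherits the site-wide default.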

Zhengqbbb avatar Jan 22 '22 09:01 Zhengqbbb

> I also think this is the right design; after all, this is a sitemap plugin. But robots.txt is spider-friendly for sites that don't submit sitemaps. In the documentation, it would be better to mention that robots.txt needs to be added to the public folder.

The missing option has been added to the docs.

Mister-Hope avatar Jan 31 '22 09:01 Mister-Hope

> > And for 1, you should set priority in frontmatter.sitemap.priority; it's injected via the spread operator, I think. Is this option problematic to you?
>
>   • But you can see that it is not mentioned in the English document, only in the Chinese one.
>   • I think the significance of this option is that, for md files on normal pages that do not declare it in frontmatter, it provides a unified source of priority. After all, not everyone is willing to think about the priority of each page.

A description has been added.

Mister-Hope avatar Jan 31 '22 09:01 Mister-Hope

@meteorlxy Could you take another look?

Mister-Hope avatar Jan 31 '22 09:01 Mister-Hope

Looking forward to the sitemap plugin!

Jelledb avatar Feb 09 '22 08:02 Jelledb

I can confirm the sitemap plugin is working fine, pulled the code from the PR in as a "vendored" dependency and it works great for our site.

JohannesRudolph avatar Feb 28 '22 13:02 JohannesRudolph

@meteorlxy 👀 Bro, please review this when you have time, and stop slacking off (摸鱼) 🤣

(Everyone else, feel free to ignore this; I can't come up with a good English translation for 摸鱼.)

Mister-Hope avatar Mar 03 '22 15:03 Mister-Hope