Etcd.io Docs/SEO Improvement Plan

Open jberkus opened this issue 5 years ago • 47 comments
I've been looking over the current etcd website and documentation, and feel that a refactor is warranted in order to improve them. Over the next few months, I plan to commit a significant part of my personal time to doing the below; I'm looking for the project's approval that these are all things that should get done. I also have the support of CNCF's web designer to help with this.

Objectives

To make etcd.io the primary destination for information about how to run, deploy, maintain, and scale etcd, both as a real resource and via search engines.

What's Working

The etcd documentation is automatically built based on the sources in the main development repository, so that we have regular updates when new versions are released. We have fully browsable per-version documentation that contains a lot of information about etcd code and internals.

What's Not Working

Currently etcd.io is often not the first hit when searching on common etcd tasks, such as "etcd setup", "how can I scale etcd", or "etcd troubleshooting". Instead, various random other pages are, some associated with our project (such as the CNCF blog) and some not (Rancher.com or Portworx pages). Importantly, because the most findable sources are not part of the official etcd documentation, the information on them is often out of date.

Part of the reason for this is that the existing documentation is not very browsable, and is structured in a way that gives it poor SEO. It is also difficult for beginning users to find what they need. This is something that we can fix.

Improvement Plan

Short Term

Home Page Improvements

Add the following static pages to the website, linked off a section on the home page:

  • Documentation TOC - page with an annotated TOC for the documentation and direct links to important pages.
  • Download Etcd - page with links to sources, packages, and containers for getting Etcd and installing it (cross-link to 3,4)
  • Getting Started with Etcdctl - page with simple instructions on how to install and get started with Etcdctl, including authentication to existing clusters.
  • Getting Started with Etcd - page with simple, 80% solution, instructions on installing Etcd on a Linux server.
  • Troubleshooting Etcd - page with lots of links to things like how to analyze errors, how to file a bug, etc. Add links to relevant blog posts as they get written.

Each of these static pages will live in the website repo rather than docs; as much as possible they will be somewhat version-agnostic. All basics will be covered on the static page, with links off of them only for more in-depth review or unusual circumstances (e.g. "etcdctl kerberos authentication").

Documentation Content

Replace TOC for documentation navigation with one that makes more logical sense from a "user journey" point of view. This will be the same TOC displayed on the static page. At a top level, this would mean replacing the existing sections with the following:

  • Getting Started (some of which links back to static pages, or copies of those pages)
  • Administration (Operations Guide + Upgrading + Platforms + some other pages which are operational, like Metrics)
  • Developer Guide (Developer guide + Learning Etcd)
  • Contributor Guide (Currently would include Reporting Bugs and Issue Triage, but more content later)
  • Additional Information (FAQ + Benchmarking + some other pages)

Each section will be reworked so that it follows some kind of logical order, for example from installation, to growing the cluster, to upgrading. Note that this may require adding additional required header information into existing documentation files.
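
As a rough illustration of the "additional required header information" mentioned above: a minimal sketch, assuming the site is built with Hugo and that page order comes from a `weight` field in YAML front matter. The helper name `add_front_matter` and the sample values are hypothetical, not part of any existing etcd tooling.

```python
import json


def add_front_matter(markdown_body, title, weight, description):
    """Prepend a YAML front matter header unless the file already has one.

    Assumption: Hugo-style front matter delimited by '---' lines, with
    'title', 'weight' and 'description' fields.
    """
    if markdown_body.startswith("---\n"):
        # Already has a header; leave the file untouched.
        return markdown_body
    header = (
        "---\n"
        # json.dumps gives a double-quoted string, which is also valid YAML.
        f"title: {json.dumps(title)}\n"
        f"weight: {weight}\n"
        f"description: {json.dumps(description)}\n"
        "---\n\n"
    )
    return header + markdown_body


page = add_front_matter("# Upgrading\n", "Upgrading etcd", 30, "How to upgrade")
print(page)
```

Lower weights would sort earlier within a section, so a sequence such as installation, growing the cluster, then upgrading could be expressed as increasing weights.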

We also need to fix multi-version navigation. That is, switching "versions" of the documentation should automatically navigate to that page in the selected version of the docs, if that page exists. If this is challenging to do with the documentation toolchain we have, re-evaluate whether we want to keep versioned documentation.
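
The version-switching behavior described above can be sketched as follows. This is only an illustration under stated assumptions: the URL layout `/docs/<version>/<page>/`, the function name `switch_version`, and the set of known pages are all hypothetical, and the real fix would live in the site's templates or scripts.

```python
from pathlib import PurePosixPath


def switch_version(current_path, target_version, existing_pages):
    """Return the URL path to open when the reader picks another docs version.

    Assumed layout (hypothetical): /docs/<version>/<page path>/
    """
    version_root = f"/docs/{target_version}/"
    parts = PurePosixPath(current_path).parts
    # parts looks like ('/', 'docs', 'v3.4', 'op-guide', 'recovery')
    if len(parts) < 3 or parts[1] != "docs":
        # Not a docs page: send the reader to the version's landing page.
        return version_root
    page = "/".join(parts[3:])
    candidate = f"{version_root}{page}/" if page else version_root
    # Keep the same page only if it exists in the selected version.
    if candidate in existing_pages:
        return candidate
    return version_root


known = {"/docs/v3.3/", "/docs/v3.3/op-guide/recovery/"}
print(switch_version("/docs/v3.4/op-guide/recovery/", "v3.3", known))
print(switch_version("/docs/v3.4/new-page/", "v3.3", known))
```

Falling back to the version's landing page, rather than returning a 404, is one possible choice for pages that do not exist in the selected version.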

Blog

Encourage weekly blog posts of unique content, mostly alternating how-to articles and release updates (patches, etc.). The new content here will drive some traffic as well as supply material for future documentation.

Cleanup

Fix 404s: we have a lot of lost pages. These need to get replaced with redirects.

Shorten redirect chains (not sure how many of these we have)

Add a Documentation tag to both the Etcd and Website repos so that we can start tracking docs issues.
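
For the redirect-chain cleanup item above, one way to shorten chains is to rewrite every redirect so that it points straight at its final destination. The sketch below assumes the redirects can be loaded as a simple source-to-target mapping; the function name `flatten_redirects` and the sample paths are hypothetical.

```python
def flatten_redirects(redirects):
    """Map every source path directly to its final destination.

    `redirects` is a dict of {source: target}. A chain such as
    /a -> /b -> /c becomes /a -> /c and /b -> /c.
    """
    flat = {}
    for source in redirects:
        seen = {source}
        target = redirects[source]
        # Follow the chain until the target is no longer redirected.
        while target in redirects:
            if target in seen:
                raise ValueError(f"redirect loop involving {target}")
            seen.add(target)
            target = redirects[target]
        flat[source] = target
    return flat


chains = {"/a": "/b", "/b": "/c", "/x": "/y"}
print(flatten_redirects(chains))
```

Raising on loops, instead of silently dropping them, makes broken entries visible during cleanup.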

Ongoing/Long Term

Blog Transfer

As we accumulate extra "How To" information into our blog, this content will be reworked in order to create a "Solutions" section for the documentation. This will be like the Tasks section in the Kubernetes documentation. Such solution-oriented documentation tends to attract links from developer sites. To the extent possible, to support this, we will develop and maintain a "blogs wanted" list of solutions for which we'd like a blog post.

Documentation Content

Add a review checklist of documentation pages to be reviewed with each version release. This helps prevent staleness and inaccuracy. Or, if we don't have time to review them, at least we have a TODO list for contributors.

Build out the contributor section, encompassing the information in Contributing.md, and add information specifically about contributing to the documentation. Create and build out a Solutions section per the above.

Documentation Contributors

Launch a long-term campaign to attract documentation-only contributors in order to reduce burden on the primary maintainers. Such a campaign would include:

  • Clear documentation on how to contribute to the docs.
  • Ensuring that documentation patches get reviewed and applied (or rejected).
  • A path to documentation contribution seniority/authority, so that doc contributors can review doc contributions.
  • Doc sprint events at Kubecon and/or WriteTheDocs

jberkus avatar Apr 08 '20 23:04 jberkus

Comments? Corrections? If this is a good plan, can the project owners approve it?

jberkus avatar Apr 21 '20 16:04 jberkus

Did you send this to etcd-dev? Very few people subscribe to notifications on this repo IIRC. You can cc [email protected]. too.

philips avatar Apr 21 '20 17:04 philips

adding @lucperkins

spzala avatar May 14 '20 01:05 spzala

Thanks @jberkus! Improving the docs as you detailed sounds great to me! Wondering about a couple of things:

  • Is the refactoring/improvement plan you are proposing modeled after any existing CNCF project?
  • Is there any standardization among CNCF projects for docs/docs repos in look and feel, search engine results, content requirements, contribution practices, etc.?

spzala avatar May 14 '20 01:05 spzala

A) No, it's based on feedback from a docs specialist and an SEO specialist.

B) There purposefully is not. That's one of the areas where the CNCF is very specific that they do not control or set requirements.

jberkus avatar May 16 '20 00:05 jberkus

Thanks!

spzala avatar May 16 '20 01:05 spzala

https://github.com/etcd-io/etcd/issues/12180 - doc improvement consideration related

spzala avatar Jul 28 '20 16:07 spzala

Hi 👋, I'm a Developer Advocate for documentation over at the CNCF, and I'd like to help with this project.

nate-double-u avatar Nov 17 '20 23:11 nate-double-u

@nate-double-u thank you, that would be great! Would you like to compile the kind of improvements that you think of? I can schedule a meeting in the first week of December with maintainers. This week is KubeCon. We have a monthly meeting next week but it's a KubeCon week. Please feel free to ask any questions meanwhile. Thanks! /cc @xiang90 @gyuho @jpbetz @jingyih

spzala avatar Nov 18 '20 00:11 spzala

Meeting at the first community meeting in December would be awesome.

jberkus avatar Nov 18 '20 23:11 jberkus

I agree, the first community meeting in December would work for me as well.

nate-double-u avatar Nov 18 '20 23:11 nate-double-u

@jberkus @nate-double-u - sounds great! Thanks!

spzala avatar Nov 19 '20 01:11 spzala

Thanks, @nate-double-u @jberkus @spzala , I will add that to community meeting agenda.

wenjiaswe avatar Nov 19 '20 18:11 wenjiaswe

Awesome, thank you @wenjiaswe !!

spzala avatar Nov 19 '20 20:11 spzala

During the planning discussions for this improvement work, the idea of migrating the documentation from the main code repo to the website repo came up.

I think that if we want to do this, we should do it as a part of this project.

I also think we should do this. There are some pros and cons, but I think that the positives outweigh the negatives.

Migration pros

  • Single sourcing content - the code in the website will be the source of truth, and we would no longer have to worry about reconciling the versions if they fall out of sync
  • We can build a continuous delivery pipeline so the latest content is available
  • Versions can be managed by copying the current latest content folder
  • SEO (to be confirmed) I understand that Google indexes code in the master branch of public repositories. If true, this means the same content is getting indexed in two places - the code repo and the website itself. (For this, and other reasons, we may want to change the website repo's default branch to "main" as well)
  • Changes to the website would be able to be made w/o making a PR to the main code repo.
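
The "copy the current latest content folder" approach to versioning in the list above could look roughly like this. It is a sketch only: the folder name `next`, the function name `cut_version`, and the directory layout are assumptions for illustration, not the site's confirmed structure.

```python
import shutil
from pathlib import Path


def cut_version(docs_root, new_version, latest="next"):
    """Create a new versioned docs folder by copying the latest content.

    Assumption: docs_root contains one sub-folder per version, plus a
    folder (here called 'next') holding the in-progress content.
    """
    src = Path(docs_root) / latest
    dst = Path(docs_root) / new_version
    if dst.exists():
        # Refuse to overwrite a version that was already published.
        raise FileExistsError(f"{dst} already exists")
    shutil.copytree(src, dst)
    return dst
```

Calling `cut_version("content/en/docs", "v3.5")` would, under these assumptions, snapshot the current content as the v3.5 docs while leaving the `next` folder in place for further edits.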

Migration cons

  • Currently a feature PR can include documentation
  • The developers are used to the current system

Example sites

To see how this could look, here are some CNCF graduated projects that have their docs living in a separate website repo:

  • https://github.com/kubernetes/website/
  • https://github.com/goharbor/harbor/tree/master/docs
  • https://github.com/helm/helm-www

Each project in the CNCF does things differently though, so there are examples of projects keeping their docs in their code repos too.

nate-double-u avatar Jan 06 '21 00:01 nate-double-u

As a part of this Etcd.io Docs/SEO Improvement Plan, I'd like to suggest that we migrate the site to use the Hugo docsy theme.

Currently, anything that is not from the code repo's Documentation folder, or in the blog, is hardcoded into the layout html files, or has copy embedded in the config.toml file.

I think that reorganizing the site layout templates using the Hugo Docsy theme will make it easier to evolve and maintain the site and achieve the asks outlined above.

Outcomes

  • New pages can be .md files, which are easier than html files to maintain for non-coders
  • We can remove content from config and html files and put it into .md files
  • The site becomes much less bespoke, meaning we can leverage existing expertise of other CNCF/Linux Foundation sites (such as the Kubernetes website)
  • Docsy provides a path for localization/translation (when the time comes)
  • Using a theme will make maintenance easier (for instance, we'll be able to pull in patches made to the docsy theme's upstream repo)

Process/Possible prep work

If we agree that this is a viable path forward, I'd like to do the docsy migration before building the rest of pages that @jberkus has outlined in the Documentation Content section above.

  1. Open an umbrella docsy migration issue
  2. Create a docsy branch
  3. Add docsy submodule to main
  4. (Consider) moving current site pages to content/en
  5. ... others as we move the process forward

nate-double-u avatar Jan 07 '21 21:01 nate-double-u

Some docs are generated, e.g. from protos: https://github.com/etcd-io/etcd/blob/69e99e80fa02a9120710039222ef085c6c36ea27/scripts/genproto.sh#L98,

so part of this script probably would need to be hosted in the 'docs' repo and pull source files from it.

ptabor avatar Jan 14 '21 21:01 ptabor

Some docs are generated

Good to know, thanks @ptabor!

nate-double-u avatar Jan 14 '21 23:01 nate-double-u

Based on a conversation with @chalin (who worked on the gRPC Docsy Conversion), I'd like to plan the next phases of work on this improvement plan as follows:

  1. Migrate/consolidate the existing documentation into the etcd-io/website repo. (https://github.com/etcd-io/website/issues/93)
     1.a. Clean up the existing site architecture. (Bring prose out of config files, layout file, etc. and make the site generally more hugo-like)
  2. Design new information architecture. Based on the "Documentation Content" section of the plan laid out by @jberkus above. This is where we'd decide which new pages need to be built, and what content they would have.
  3. Plan the Docsy migration.
  4. Migrate to Docsy (https://github.com/etcd-io/website/issues/94)
  5. Build new pages based on the architecture designed in step 2.

I'm not yet sure of the order of steps 4 and 5; I think there may be some advantage to doing them together. Things may become clearer as we do the planning. The cleanup phase may also help us better understand how the Docsy migration could go.

/cc @zacharysarah

nate-double-u avatar Jan 22 '21 21:01 nate-double-u

I've put together a spreadsheet for tracking tasks and estimating time.

https://docs.google.com/spreadsheets/d/1-ZQMPc_eQ0fh1pwOHv3NltwT-ifKA5UcFePfdFiWn8I/edit?usp=sharing

I'm hoping each line item can be made into an issue here on GitHub. It's not entirely complete yet, but as we move through the process we can add to and change it. I'm always happy for suggestions and feedback!

nate-double-u avatar Jan 27 '21 18:01 nate-double-u

I'm happy to say that PR https://github.com/etcd-io/website/pull/244 has been merged in, and we're now running the Docsy theme on etcd.io

The improvement plan continues! 🙂

nate-double-u avatar Apr 26 '21 17:04 nate-double-u

make etcd.io the primary destination for information about how to run, deploy, maintain, and scale etcd, both as a real resource and via search engines.

What should be done (if anything) with https://github.com/etcd-io/etcdlabs, which is live at http://play.etcd.io/home?

chalin avatar May 06 '21 15:05 chalin

As @chalin and I will be migrating off the etcd.io project, I want to put together an update as to where the project stands and what still needs to be done. I’ll quote @jberkus’ original text as needed, but will prune duplications or parts that are answered in other sections. As there is quite a bit here, I’ll break this into several posts.

Milestone trackers

Tracking Issues

  • #267
  • #247

nate-double-u avatar Jun 25 '21 16:06 nate-double-u

Objectives

To make etcd.io the primary destination for information about how to run, deploy, maintain, and scale etcd, both as a real resource and via search engines.

  • [x] Patrice (@chalin) has set up some analytics, perhaps we should publish a link to it on the site like K8s does #387.

What's Not Working

Currently etcd.io is often not the first hit when searching on common etcd tasks, such as "etcd setup", "how can I scale etcd", or "etcd troubleshooting". Instead, various random other pages are, some associated with our project (such as the CNCF blog) and some not (Rancher.com or Portworx pages). Importantly, because the most findable sources are not part of the official etcd documentation, the information on them is often out of date.

  • [x] We are the top results now.

nate-double-u avatar Jun 25 '21 16:06 nate-double-u

Improvement Plan

Short Term

Home Page Improvements

Add the following static pages to the website, linked off a section on the home page:

* Documentation TOC - page with an annotated TOC for the documentation and direct links to important pages.
  • [ ] See next comment (https://github.com/etcd-io/website/issues/65#issuecomment-868695284)
* Download Etcd - page with links to sources, packages, and containers for getting Etcd and installing it (cross-link to 3,4)
  • [x] I think the Install page covers this, and can be updated if not.
* Getting Started with Etcdctl - page with simple instructions on how to install and get started with Etcdctl, including authentication to existing clusters.
  • [ ] Write "Getting Started with etcdctl" page #394
  • [ ] Also, a potentially related open issue: Add etcdutl documentation #309
* Getting Started with Etcd - page with simple, 80% solution, instructions on installing Etcd on a Linux server.
* Troubleshooting Etcd - page with lots of links to things like how to analyze errors, how to file a bug, etc.  Add links to relevant blog posts as they get written.
  • [ ] Write "Troubleshooting etcd" page #395

Each of these static pages will live in the website repo rather than docs; as much as possible they will be somewhat version-agnostic. All basics will be covered on the static page, with links off of them only for more in-depth review or unusual circumstances (e.g. "etcdctl kerberos authentication").

  • [x] These pages have been mostly added under versioned folders, as while they may not change much when they do change, they're likely to change with the version.

nate-double-u avatar Jun 25 '21 16:06 nate-double-u

Documentation Content

Replace TOC for documentation navigation with one that makes more logical sense from a "user journey" point of view. This will be the same TOC displayed on the static page. At a top level, this would mean replacing the existing sections with the following:

There hasn’t been a new table of contents page made, but the existing one, for each version since v3.4, has been reordered based on discussion about the order in which the info is best presented (see Documentation Content: TOC — Compilation PR (weights & descriptions) https://github.com/etcd-io/etcd/pull/12575, and Adding weights and descriptions to v3.4 from next #168)

* Getting Started (some of which links back to static pages, or copies of those pages)
  • [x] Quickstart page for etcd #278
* Administration (Operations Guide + Upgrading + Platforms + some other pages which are operational, like Metrics)
  • [ ] Create Administration page #401 as per #267
* Developer Guide (Developer guide + Learning Etcd)
  • [ ] The New IA implementation #267 issue needs to be updated to reflect this.
* Contributor Guide (Currently would include Reporting Bugs and Issue Triage, but more content later)
  • [ ] The Contributor Guide is slated to be moved under the Community page, and the Issue Triage and Reporting Bugs pages with it. See #267
* Additional Information (FAQ + Benchmarking + some other pages)
  • [ ] The New IA implementation #267 issue needs to be updated to reflect this.

Each section will be reworked so that it follows some kind of logical order, for example from installation, to growing the cluster, to upgrading. Note that this may require adding additional required header information into existing documentation files.

  • [x] #168, #213, #214

We also need to fix multi-version navigation. That is, switching "versions" of the documentation should automatically navigate to that page in the selected version of the docs, if that page exists. If this is challenging to do with the documentation toolchain we have, re-evaluate whether we want to keep versioned documentation.

  • [x] Multi-version navigation has been fixed. (Documentation Content: Versioning (Fix multi-version navigation) #86)

nate-double-u avatar Jun 25 '21 16:06 nate-double-u

Blog

Encourage weekly blog posts of unique content, mostly alternating how-to articles and release updates (patches, etc.). The new content here will drive some traffic as well as supply material for future documentation.

Cleanup

Fix 404s: we have a lot of lost pages. These need to get replaced with redirects.

  • [x] These have mostly been fixed: #174, #203, but a few broken links remain (#245).

Shorten redirect chains (not sure how many of these we have)

  • [x] All redirects have been moved to the layouts/index.redirects file. There aren't an unmanageable number of them.

Add a Documentation tag to both the Etcd and Website repos so that we can start tracking docs issues.

I had added a 'documentation' tag to the website repo, but have since removed it. It was rarely used, and since issues opened on the website repo relate to documentation by default, it was a bit redundant.

Ongoing/Long Term

Blog Transfer

As we accumulate extra "How To" information into our blog, this content will be reworked in order to create a "Solutions" section for the documentation. This will be like the Tasks section in the Kubernetes documentation. Such solution-oriented documentation tends to attract links from developer sites. To the extent possible, to support this, we will develop and maintain a "blogs wanted" list of solutions for which we'd like a blog post.

Blog work noted in the process of reorganizing the site:

  • [x] Convert Platforms/Amazon Web Services page into a blog post #396
  • [x] Convert "Platforms/Container Linux with systemd" page into a blog post #397
  • [x] Convert "Platforms/FreeBSD" page into a blog post #398
  • [ ] Convert @marifse's Platforms/IBM PR into a blog post #399

nate-double-u avatar Jun 25 '21 16:06 nate-double-u

Suggested Additions to the Improvement Plan

During the planning discussions for this improvement work, the idea of migrating the documentation from the main code repo to the website repo came up.

  • [x] This was completed with PR: Migrate documentation: Add docs to etcd-io/website #99

As a part of this Etcd.io Docs/SEO Improvement Plan, I'd like to suggest that we migrate the site to use the Hugo docsy theme.

  • [x] This was completed with PR: Docsy theme #244

nate-double-u avatar Jun 25 '21 16:06 nate-double-u

Thank you for your terrific work on this. The reorganization you've done will make it possible for us to maintain the docs better and in a more searchable format.

jberkus avatar Jun 28 '21 23:06 jberkus

Yea, thank you. The site looks great.

philips avatar Jun 28 '21 23:06 philips