mdBook
Fix text from menu bar in Google Search
I noticed that for some mdBooks, Google's search results show the list of themes and the book title in the snippet. It's a small thing, but an eyesore.
My guess is that Google treated #menu-bar as part of the page content and indexed it. To prevent this, I changed the menu bar's tag to <header>.
Thanks! Seems like it would be great to fix this. Do you happen to know if there is a way to test how Google generates the snippet? Or do you have any links to information about whether or not this will make a difference? All I could find is https://developers.google.com/search/docs/appearance/snippet?hl=en, which only really mentions adding a description.
@ehuss I also looked for information but didn't find anything. Experienced web developers suggest that semantic markup is very important and that Google somehow takes it into account when indexing.
Thanks for pointing this out @vklachkov! I could be wrong, but I don't think a <header> tag is necessarily excluded from snippets. There are two paths that I think will work:
- Put the data-nosnippet attribute on the element that contains the navbar: https://developers.google.com/search/docs/crawling-indexing/robots-meta-tag#data-nosnippet-attr
- Wrap the navbar in a <nav> element, which is what rustdoc does. I can't find documentation that Google will definitely skip it for snippets, but semantically that makes sense. Also it may help with screen readers: https://developer.mozilla.org/en-US/docs/Web/HTML/Element/nav#usage_notes
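For illustration, the two options above might look roughly like this. This is only a sketch: it assumes the menu bar is a single element with the id menu-bar (the id mentioned earlier in this thread), and the inner contents are placeholders rather than mdBook's actual template.

```html
<!-- Option 1: keep the existing container and opt it out of snippets.
     Google's documentation lists data-nosnippet as supported on
     span, div, and section elements, so a div is used here. -->
<div id="menu-bar" data-nosnippet>
    <!-- placeholder: theme picker, search toggle, book title -->
</div>

<!-- Option 2: mark the same region as navigation.
     Whether Google skips <nav> when building snippets is undocumented. -->
<nav id="menu-bar">
    <!-- placeholder: theme picker, search toggle, book title -->
</nav>
```

Only one of the two would be used in practice, since an id must be unique within a page.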
Sorry for the long wait, @jsha.
I think that although both nav and data-nosnippet would work, neither is quite right semantically.
To quote MDN:
The <header> element can define a global site header <...>. It usually includes a logo, company name, search feature <...>. It is generally located at the top of the page.
I think the <header> tag would be more semantically correct. But if you insist, I can replace it with <nav>, as in rustdoc.
That's interesting @vklachkov! Also from MDN, in the historical note section:
The <header> element originally existed at the very beginning of HTML for headings. It is seen in the very first website. At some point, headings became <h1> through <h6>, allowing <header> to be free to fill a different role.
And indeed I was thinking of <header> in that old-school role.
I'm not a maintainer on mdBook, so I don't insist either way, and I don't have an opinion as to whether <nav> or (the modern semantics of) <header> is more semantically correct. There's no documentation of specific tags being skipped for Google snippets, other than the data-nosnippet attribute.
I suppose my recommendation to the mdBook team would be to go ahead and accept this as harmless and easy, and if it doesn't do the job @vklachkov could submit a followup PR with data-nosnippet or <nav>. :D
:umbrella: The latest upstream changes (possibly #2681) made this pull request unmergeable. Please resolve the merge conflicts.