Clarification regarding Googlebot
I've read https://github.com/GoogleChrome/rendertron/pull/258#issuecomment-453104073, but my understanding matches that of @barroudjo (see https://github.com/GoogleChrome/rendertron/pull/258#issuecomment-534944916). As far as I'm aware, JavaScript indexing occurs in a "second wave", which can be delayed by a non-trivial amount of time. So, by default, content will not be indexed as fast as it could be if it were pre-rendered. This doesn't seem ideal?
What is the benefit of not pre-rendering for Googlebot? What is a "possibly richer version"? Aren't pages supposed to include the same data?
There's not really a second wave with a significant delay in this way, AFAIK. All pages go through the renderer, and the median time for that is apparently around the 5-second mark (see Martin's Twitter comment).
So IMHO, in this respect, pre-rendering for Googlebot doesn't add any benefit by default, just fragility, by the nature of doing something different: caches not being cleared, additional infrastructure to remember about and debug, and potentially more overhead for you in catering for it.
I guess there are circumstances where you might be doing something Googlebot doesn't quite get along with, where Rendertron might make sense, but I would imagine these cases are getting fewer and further between.
@dwsmart,
Thanks for linking to that Twitter comment, that makes me feel better about leaving Googlebot out of the list.
It seems that my understanding (based on this YouTube video) is outdated.
Perhaps there could be an FAQ entry like "Why isn't Googlebot included in the default list of user agents?".
Otherwise, feel free to close this issue (anyone with the permission to do so). 🙂
It kind of blows my mind that Googlebot isn't included by default, if that truly is the case now. I've been using Rendertron for approximately two years now and I've always had Googlebot in my middleware list. Furthermore, I pre-render ALL my pages (and invalidate and re-render them when content changes), so that when Googlebot finally hits me, the JSON is already ready to be served in a flash; I don't wait for the search engines to request a page before I pre-render it. And I have ~50k pages, all pre-rendered on an unlimited filesystem cache.
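To make that concrete, the gist of my setup is a small script that hits Rendertron's `/render/<url>` route for every page whenever content changes, so the cache is always warm before a crawler arrives. A rough sketch (the host and URL list below are placeholders, and the `/invalidate/<url>` route only exists in Rendertron versions with cache invalidation support, so check your version's docs):

```js
// Minimal sketch of a cache pre-warming script (Node 18+, global fetch).
// RENDERTRON_URL and the page list are placeholders for illustration;
// /render/<url> is Rendertron's documented render route, while
// /invalidate/<url> is only present in versions with cache invalidation.
const RENDERTRON_URL = 'http://localhost:3000';

const pages = [
  'https://example.com/',
  'https://example.com/articles/1',
];

async function prerender(pageUrl) {
  // With caching enabled, this request renders the page and stores the
  // result, so the next bot request is served straight from the cache.
  const res = await fetch(`${RENDERTRON_URL}/render/${pageUrl}`);
  console.log(`${res.status} ${pageUrl}`);
}

async function invalidate(pageUrl) {
  // Evict a stale entry after a content change, then re-render it.
  await fetch(`${RENDERTRON_URL}/invalidate/${pageUrl}`);
  await prerender(pageUrl);
}

(async () => {
  for (const page of pages) {
    await prerender(page);
  }
})();
```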
According to Google Search Console I'm doing swimmingly well: I have tons of my content indexed and all is good. I should note that my content is all loaded through AJAX queries. All is great in my world right now... it seems like removing Googlebot would be completely contrary to the purpose of Rendertron o_O
Googlebot renders JavaScript, so adding it to the bot list is not necessary unless your JavaScript causes problems in certain scenarios.
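If you do hit one of those scenarios and still want Googlebot proxied, you can override the middleware's default pattern. A rough sketch with `rendertron-middleware` (the proxy URL and bot names here are illustrative; `userAgentPattern` is the option that replaces the default user agent list, so double-check the package README for your version):

```js
const express = require('express');
const rendertron = require('rendertron-middleware');

const app = express();

// Illustrative bot list; note that Googlebot is deliberately absent from
// the middleware's default pattern, so it has to be added explicitly.
const bots = [
  'googlebot',
  'bingbot',
  'yandex',
  'baiduspider',
  'facebookexternalhit',
  'twitterbot',
];

app.use(rendertron.makeMiddleware({
  // Point this at your own Rendertron deployment.
  proxyUrl: 'http://my-rendertron-instance/render',
  userAgentPattern: new RegExp(bots.join('|'), 'i'),
}));

app.use(express.static('dist'));
app.listen(8080);
```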
In general, as we're deprecating the project, you should look into alternative approaches to rendering on the web.