Google Search Indexing - Video is not the main content of the page

Open DVDGuy99 opened this issue 1 year ago • 19 comments

Describe the current behavior

Google Search Console gives the error "Video is not the main content of the page" when indexing videos on our PeerTube site.

This is one of the video pages that Google says the video is not the main content:

https://trailers.ddigest.com/w/sbKiNjKTCs8EkNpKb45ku9

(Actually, I think it will say that for all the video pages. Digging deeper into the video page indexing data, it says "Video is supplementary content on the page".)

Possibly related, but viewing a screenshot of the page generated within Google Search Console shows it displaying a "HLS.js does not seem to be supported" error where the video should be. Doing an exact term search for this on the videos section of Google shows quite a few PeerTube-hosted videos that have this as the crawled text description for the video.

Steps to reproduce

  1. Log in to Google Search Console account for the instance
  2. Under "Indexing" go to "Video Pages"
  3. A list of pages with the "Video is not the main content of the page" error should be listed here

Describe the expected behavior

As these pages are the main video playback pages, the video should of course be the main content of the page and Google should index these as such. Pages with videos that are indexed as the main content will show the video carousel with the video thumbnail as opposed to just a text link.

Additional information

  • PeerTube instance:

    • URL: https://trailers.ddigest.com/
    • Version: 6.0.3
    • NodeJS version:
    • Ffmpeg version:
  • Browser name, version and platforms on which you could reproduce the bug:

  • Link to browser console log if relevant:

  • Link to server log if relevant (journalctl or /var/www/peertube/storage/logs/):

DVDGuy99 avatar Feb 08 '24 11:02 DVDGuy99

Googlebot seems to fail to load the HLS player (which makes no sense). I'm trying to fall back to a raw HTML element using https://github.com/Chocobozzz/PeerTube/commit/c4a062109d562cbe505c17044dd0b569a92ea121

Hopefully this fixes the issue (I have to wait for the deploy on peertube2.cpy.re and then reschedule a Googlebot indexing).

Chocobozzz avatar Feb 23 '24 14:02 Chocobozzz

This seems to fix the issue :+1:

Chocobozzz avatar Feb 26 '24 15:02 Chocobozzz

Has the problem been solved or is the same problem still present?

Indexing-pages-with-videos-URL-inspection 1g

aflamrip avatar Mar 21 '24 17:03 aflamrip

> Has the problem been solved or is the same problem still present?

It should be fixed in the next PeerTube release (6.1.0).

Chocobozzz avatar Mar 25 '24 09:03 Chocobozzz

I think I found a solution, but I don't know whether it is right or wrong. In the path peertube-latest\client\dist\standalone\videos there is a file, embed.html. This part is modified to:

But I don't know if this method will solve the "Video is not the main content of the page" problem.

Video placement: Video is supplementary content on the page

Whether the page is a playback page for a single video (Video is main content on the page), or hosts additional meaningful content or videos (Video is supplementary content on the page).

aflamrip avatar Mar 25 '24 22:03 aflamrip

This issue seems to be still present in 6.1.0.

I've tried enabling/disabling web video, HLS with P2P support, and it doesn't seem to matter too much, as it still gives the "video is not the main content of the page" error:

google_search_console_not_main_content

Below are the JavaScript console error messages as shown in Google Search Console for a sample page (https://trailers.ddigest.com/w/1ZcXuBacku4tZeY7KPHwPF), included in case they help:

google_search_console_errors

DVDGuy99 avatar May 08 '24 06:05 DVDGuy99

Here's the video page indexing report for another video with web video enabled:

google_search_console_not_main_content2

DVDGuy99 avatar May 08 '24 06:05 DVDGuy99

It makes no sense: sometimes Google considers that the video is not the main content of the page, and a few days later it correctly indexes the video. I'll look into it again, but if anyone here has any clue, don't hesitate to share it.

Chocobozzz avatar May 17 '24 07:05 Chocobozzz

This thread might shed some light, and I think there's a really stupid fix for all of this involving adding the word "video" to the URL:

https://support.google.com/webmasters/thread/247936417/how-to-fix-video-is-not-the-main-content-of-the-page?hl=en

There are only 4 videos on my site that have been indexed and the URL that is indexed is like this:

https://trailers.ddigest.com/videos/watch/218beda6-427d-4ba5-83ad-d815cd13fbc6

Whereas all the ones not indexed are like this:

https://trailers.ddigest.com/w/jtdUAPbo65bNgz4Momxmm4

I wonder if a separate Google sitemap can be created that uses the "videos/watch" URL structure as opposed to the "w/" one. For now, Google doesn't seem to care that the first one redirects to the second one.

DVDGuy99 avatar May 20 '24 02:05 DVDGuy99

I've set up a cron job that creates a modified version of the sitemap as a workaround for this issue. The script basically replaces "https://trailers.ddigest.com/w/" with "https://trailers.ddigest.com/videos/watch/" in the sitemap, and I then replaced the submitted sitemap in Google Search Console with this edited one. This seems to work and videos are now being indexed, even though it shouldn't (as I'm submitting pages with redirects):

Screenshot 2024-06-01 141625
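For anyone who wants to try the same workaround, the rewrite step could look roughly like the sketch below. This is not the actual cron script, just a minimal illustration: `rewrite_sitemap` is a hypothetical helper, the sample XML is made up, and it assumes the sitemap lists watch pages in plain `<loc>` elements using the short `/w/` form.

```python
import re

# Assumption: adjust the host to your own instance.
INSTANCE = "https://trailers.ddigest.com"
SHORT_PREFIX = f"{INSTANCE}/w/"
LONG_PREFIX = f"{INSTANCE}/videos/watch/"


def rewrite_sitemap(xml_text: str) -> str:
    """Rewrite <loc> URLs from the short /w/ form to the /videos/watch/ form."""
    pattern = re.compile(r"(<loc>\s*)" + re.escape(SHORT_PREFIX))
    # A function is used as the replacement so that no character in the
    # URL prefix is interpreted as a regex escape.
    return pattern.sub(lambda m: m.group(1) + LONG_PREFIX, xml_text)


# Made-up sample input; a real script would read the instance's sitemap instead.
sample = (
    "<urlset>"
    "<url><loc>https://trailers.ddigest.com/w/jtdUAPbo65bNgz4Momxmm4</loc></url>"
    "<url><loc>https://trailers.ddigest.com/about</loc></url>"
    "</urlset>"
)
rewritten = rewrite_sitemap(sample)
print(rewritten)
```

The rewritten file would then be the one submitted in Google Search Console; other URLs in the sitemap are left untouched.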

DVDGuy99 avatar Jun 01 '24 04:06 DVDGuy99

@DVDGuy99 Checking in for news: does Google index all your videos with the new /videos/watch URLs now?

Chocobozzz avatar Jun 14 '24 06:06 Chocobozzz

@Chocobozzz Yes, pretty much. It doesn't seem to re-add the videos that have already been indexed, even if they've been submitted via the sitemap. I'll try to force the reindexing (via the request indexing feature in Google Search Console) on a couple of older ones to see if they are also added/re-added.

DVDGuy99 avatar Jun 16 '24 12:06 DVDGuy99

The older videos I've requested reindexing for have also been indexed as videos (moved out of the "Video is not the main content of the page" category), as have all the new videos that are included in my modified sitemap.

DVDGuy99 avatar Jun 21 '24 02:06 DVDGuy99

Unfortunately it doesn't work on my side: indexing https://framatube.org/videos/watch/gW6BUFLNSDWWZwUzZBXLoN instead of https://framatube.org/w/gW6BUFLNSDWWZwUzZBXLoN is refused by Google because of the redirect. I'm surprised it works on your instance :thinking:

Chocobozzz avatar Jun 27 '24 08:06 Chocobozzz

It's definitely a weird situation with Google at the moment, and I'm thinking it has to be a bug or something. I have several videos where, as you said, the page won't get indexed because it's a redirect, but the video (with the "/videos/watch/" URL) does get indexed. So it seems that video pages only get indexed if they have "video" in the URL, even if it's a redirect. I'm going to submit the "/w/" version of the URL for these pages and see what happens. Maybe the page gets indexed but the video indexing is removed due to the "not the main content" error.

DVDGuy99 avatar Jun 28 '24 12:06 DVDGuy99

So I managed to get both versions of the page indexed by Google. Don't ask me how it works, it's not supposed to, but it does for me.

Screenshot 2024-07-06 160014 Screenshot 2024-07-06 155950

DVDGuy99 avatar Jul 06 '24 06:07 DVDGuy99

According to Google's docs it's recommended to have a video:content_loc tag in the sitemap, which currently doesn't exist in PeerTube's sitemap.

> It's required to provide either a video:content_loc or video:player_loc tag. We recommend that you provide the video:content_loc tag, if possible. This is the most effective way for Google to fetch your video content files. If video:content_loc isn't available, provide video:player_loc as an alternative.
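To make the suggestion concrete, here is a rough sketch of what a sitemap entry with video:content_loc could look like, built with Python's standard library. It is not PeerTube's sitemap code: `video_url_entry` is a hypothetical helper, and every URL and text value in the example is a placeholder. Only the two namespace URIs and the tag names come from the sitemap and Google video sitemap formats.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
VIDEO_NS = "http://www.google.com/schemas/sitemap-video/1.1"
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("video", VIDEO_NS)


def video_url_entry(page_url, title, description, thumbnail_url, content_url):
    """Build one <url> element carrying a <video:video> block."""
    url = ET.Element(f"{{{SITEMAP_NS}}}url")
    ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
    video = ET.SubElement(url, f"{{{VIDEO_NS}}}video")
    for tag, value in (
        ("thumbnail_loc", thumbnail_url),
        ("title", title),
        ("description", description),
        ("content_loc", content_url),  # direct URL of the media file
    ):
        ET.SubElement(video, f"{{{VIDEO_NS}}}{tag}").text = value
    return url


# Placeholder values for illustration only.
entry = video_url_entry(
    page_url="https://peertube.example/w/abc123",
    title="Example trailer",
    description="Placeholder description of the video.",
    thumbnail_url="https://peertube.example/thumbnails/abc123.jpg",
    content_url="https://peertube.example/static/abc123.mp4",
)
entry_xml = ET.tostring(entry, encoding="unicode")
print(entry_xml)
```

Whether adding this tag changes how Google classifies the page is untested here; it only shows the shape of the entry the docs describe.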

Another explanation may be these client logs, which seem to come from Googlebot:

{
    "tags": [
        "client"
    ],
    "userAgent": "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/128.0.6613.113 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "meta": "{\"currentTime\":0,\"data\":{\"type\":\"mediaError\",\"details\":\"manifestIncompatibleCodecsError\",\"fatal\":true,\"url\":\"https://cdn.peertube/streaming-playlists-native/hls/4bbb3e31-24fa-4ec4-8daa-ec6f1d54b4ef/bab5b5b4-884e-456e-9a84-fcd2f6a6e623-master.m3u8\",\"error\":{},\"reason\":\"no level with compatible codecs found in manifest\"}}",
    "url": "https://peertube/w/4bbb3e31-24fa-4ec4-8daa-ec6f1d54b4ef",
    "level": "error",
    "message": "Client log: HLS.js error: mediaError - fatal: true - manifestIncompatibleCodecsError",
    "timestamp": "2024-09-11T11:58:06.448Z"
}
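If you want to look for similar entries in your own logs, note that the "meta" field is itself a JSON-encoded string, so it has to be decoded a second time. A minimal sketch, assuming records shaped like the one above (`summarize_client_error` is a hypothetical helper, the record below is an abridged copy, and the user agent check is a plain substring match that can be spoofed):

```python
import json

# Abridged copy of the log entry above; "meta" is a JSON string inside JSON.
record = {
    "tags": ["client"],
    "userAgent": (
        "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) "
        "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/128.0.6613.113 "
        "Mobile Safari/537.36 (compatible; Googlebot/2.1; "
        "+http://www.google.com/bot.html)"
    ),
    "meta": json.dumps({
        "currentTime": 0,
        "data": {
            "type": "mediaError",
            "details": "manifestIncompatibleCodecsError",
            "fatal": True,
            "reason": "no level with compatible codecs found in manifest",
        },
    }),
    "level": "error",
}


def summarize_client_error(entry):
    """Decode the nested "meta" string and pull out the HLS.js error fields."""
    data = json.loads(entry["meta"]).get("data", {})
    return {
        # Substring match only; this does not verify the request's origin.
        "from_googlebot": "Googlebot" in entry.get("userAgent", ""),
        "type": data.get("type"),
        "details": data.get("details"),
        "fatal": data.get("fatal"),
        "reason": data.get("reason"),
    }


summary = summarize_client_error(record)
print(summary)
```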

It may also be worth a try to add more structured data to each watch page to convince Googlebot that it's really a watch page, not an article with a video. https://developers.google.com/search/docs/appearance/structured-data/video#examples
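As an illustration of the kind of markup meant here, the sketch below renders a schema.org VideoObject as a JSON-LD script tag. It is only an assumption about what could be added: `video_json_ld` is a hypothetical helper, all values are placeholders, and I haven't checked which of these properties PeerTube already emits on watch pages.

```python
import json


def video_json_ld(name, description, thumbnail_url, upload_date,
                  content_url, embed_url, duration):
    """Render a schema.org VideoObject as a JSON-LD <script> tag."""
    data = {
        "@context": "https://schema.org",
        "@type": "VideoObject",
        "name": name,
        "description": description,
        "thumbnailUrl": [thumbnail_url],
        "uploadDate": upload_date,   # ISO 8601 date-time
        "duration": duration,        # ISO 8601 duration, e.g. "PT2M30S"
        "contentUrl": content_url,
        "embedUrl": embed_url,
    }
    # Escape "<" so that the payload cannot close the script tag early.
    payload = json.dumps(data).replace("<", "\\u003c")
    return f'<script type="application/ld+json">{payload}</script>'


# Placeholder values for illustration only.
tag = video_json_ld(
    name="Example trailer",
    description="Placeholder description of the video.",
    thumbnail_url="https://peertube.example/thumbnails/abc123.jpg",
    upload_date="2024-09-16T19:09:00+00:00",
    content_url="https://peertube.example/static/abc123.mp4",
    embed_url="https://peertube.example/videos/embed/abc123",
    duration="PT2M30S",
)
print(tag)
```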

kontrollanten avatar Sep 16 '24 19:09 kontrollanten

> According to Google's docs it's recommended to have a video:content_loc tag in the sitemap, which currently doesn't exist in PeerTube's sitemap.

Why not, but I think most other web video platforms (YouTube, Vimeo...) don't include this tag but are still indexed :angry:

> Another explanation may be these client logs, which seem to come from Googlebot:

I think this is expected behaviour, where Googlebot has video support disabled in its engine.

Chocobozzz avatar Sep 17 '24 06:09 Chocobozzz

> Why not, but I think most other web video platforms (YouTube, Vimeo...) don't include this tag but are still indexed 😠

Sure, but I think Googlebot makes some kind of holistic assessment, where other platforms have a higher general ranking, load faster, are easier to crawl, etc. So if we try to get every point right, maybe it'll be indexed.

kontrollanten avatar Sep 17 '24 08:09 kontrollanten

It seems the issue is mostly fixed on our PeerTube instances. @kontrollanten, @DVDGuy99 can you confirm?

Chocobozzz avatar Feb 11 '25 09:02 Chocobozzz

Not on our instance, but we haven't installed PeerTube 7 yet.

kontrollanten avatar Feb 12 '25 05:02 kontrollanten

Most of our new videos are still being put into the "video is not the main content" category (it has been renamed "Video isn't on watch page" by Google). But I do find that after a while (several weeks to more than a month), some of these will be automatically categorised as being video pages. Right now, 60% of our video pages are indexed as videos, and the other 40% (including almost all the recently uploaded ones) are in the "Video isn't on watch page" category.

I've requested a fix validation by Google to see if this can clear up some of these pages, will post the results here.

DVDGuy99 avatar Feb 12 '25 23:02 DVDGuy99

The validation eventually failed, as the same error (now called "Video isn't on a watch page") still occurs. However, I do find over time some of these pages will become classified as video pages automatically. Here's Google's document on what is considered a "watch page":

https://developers.google.com/search/docs/appearance/video?sjid=7178086079256790892-NC#watch-page

One of my theories is that pages with few visitors and few incoming links may find it hard to be classified as a watch page. This would explain why, once pages get more links and views, Google automatically re-classifies them as watch pages at a much later date.

DVDGuy99 avatar Mar 21 '25 03:03 DVDGuy99

> One of my theories is that pages with few visitors and few incoming links may find it hard to be classified as a watch page. This would explain why, once pages get more links and views, Google automatically re-classifies them as watch pages at a much later date.

I agree with this theory. I'm closing this issue, but don't hesitate to comment if you think we can improve things on the PeerTube side, or if you find other interesting info 👍

Chocobozzz avatar Mar 21 '25 05:03 Chocobozzz

We've upgraded to v7.1.0 and asked Google for a reindex. The validation did not pass, though: it still fails due to "Video is not the main content". It gives no further information.

kontrollanten avatar Mar 31 '25 03:03 kontrollanten