
Post Request Body missing in index entry

Open mona-ul opened this issue 5 months ago • 0 comments

Describe the bug

YouTube videos captured with Browsertrix are not playable in pywb.

Steps to reproduce the bug

  • Visit: https://webarchives.rhizome.org/youtube_embeds_5_1741774579/20250312101726/https://www.youtube.com/embed/n7ky-nuw-us or, alternatively, archive a YouTube page (such as https://www.youtube.com/embed/n7ky-nuw-us) with Browsertrix and add it to pywb.
  • Open Dev Tools and switch to the Console tab
  • Click play
  • See the error message for the resource "https://www.youtube.com/youtubei/v1/player?prettyPrint=false": 404 Not Found.

Expected behavior

The player resource should return 200, since it is present in both the index and the WARC file, and the video should play.

Issue

The problem lies in the pywb index entry for the player resource (https://www.youtube.com/youtubei/v1/player?prettyPrint=false): its URL search key is missing the POST request body. When the POST request body is added to the URL search key, the resource is found and the video is playable.
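To illustrate what "adding the POST request body to the URL search key" means, here is a minimal sketch of the general idea. It is not pywb's or warcio.js's actual code: the helper name `append_post_query` and the `videoId` body field are hypothetical, and the `__wb_method` parameter name is assumed to match the convention pywb uses for non-GET requests.

```python
from urllib.parse import urlencode, urlsplit


def append_post_query(url: str, method: str, body: dict) -> str:
    """Hypothetical helper: fold the request method and body fields
    into the URL so that a POST lookup gets its own search key."""
    if method.upper() == "GET":
        # GET requests are looked up by their URL alone.
        return url
    # Use "&" if the URL already has a query string, otherwise "?".
    sep = "&" if urlsplit(url).query else "?"
    # Assumption: the method marker is named "__wb_method".
    extra = urlencode({"__wb_method": method.upper(), **body})
    return url + sep + extra


# Illustrative body field only; the real player request body is larger.
key_url = append_post_query(
    "https://www.youtube.com/youtubei/v1/player?prettyPrint=false",
    "POST",
    {"videoId": "n7ky-nuw-us"},
)
```

If the index entry is written without these extra parameters while the replay lookup includes them, the two keys differ and the lookup can fail, which would be consistent with the 404 described above.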

Environment

  • pywb (version 2.8.0)
  • Browsertrix-Crawler capture (1.5.8, with warcio.js 2.4.3)

Additional context

Forum Post

As described in the forum post, the ArchiveWeb.page capture of the YouTube page works fine in pywb. The issue doesn't occur there (the pywb index is written correctly). That is how the issue was found: by comparing the index entry of the working ArchiveWeb.page collection with that of the failing Browsertrix collection.
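The comparison can be sketched roughly as follows. The two CDXJ lines are made-up examples that only mimic the shape of the entries (they are not copied from the actual collections), and `has_post_marker` is a hypothetical helper that assumes the POST data shows up in the search key as a `__wb_method=post` parameter.

```python
import json


def parse_cdxj(line: str):
    """Split a CDXJ line into (search key, timestamp, JSON fields)."""
    key, timestamp, fields = line.strip().split(" ", 2)
    return key, timestamp, json.loads(fields)


def has_post_marker(line: str) -> bool:
    """Hypothetical check: does the search key carry the POST parameters?"""
    key, _, _ = parse_cdxj(line)
    return "__wb_method=post" in key.lower()


# Illustrative lines only, not the real index entries.
working_line = (
    "com,youtube)/youtubei/v1/player?__wb_method=post&prettyprint=false"
    '&videoid=n7ky-nuw-us 20250312101726 {"status": "200"}'
)
failing_line = (
    "com,youtube)/youtubei/v1/player?prettyprint=false "
    '20250312101726 {"status": "200"}'
)

for name, line in [("working", working_line), ("failing", failing_line)]:
    print(name, has_post_marker(line))
```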

Screenshots

Failing replay, player resource not found: Image

mona-ul · May 12 '25 09:05