pywb

Core Python Web Archiving Toolkit for replay and recording of web archives
Hey! Really appreciate this project & repo, been super fun to work with! Now, I've got a question I can't really seem to figure out the answer to, so I...
## Describe the bug YouTube videos captured with Browsertrix are not playable in pywb. ## Steps to reproduce the bug - Visit: https://webarchives.rhizome.org/youtube_embeds_5_1741774579/20250312101726/https://www.youtube.com/embed/n7ky-nuw-us / Or archive a YouTube page (like https://www.youtube.com/embed/n7ky-nuw-us)...
## Describe the bug @ldko noticed that in client_side_replay mode there are failed requests for static assets hitting the ir_/ endpoint: 127.0.0.1 - - [2025-04-29 08:14:41] "GET /test/20250428230354ir_/http://localhost:8080/static/css/bootstrap.min.css HTTP/1.1" 404...
## Expected behavior The page at https://tnaqa.mirrorweb.com/ukgwa/20250313142443/https://waterinnovation.challenges.org/winners/ should load correctly with all content displayed as intended. The POST request to https://waterinnovation.challenges.org/wp-json/challenges/v1/ofwat_winners should return the necessary data to populate the page....
## Description I have replaced the individual comparison of the IDs with a `test -w $VOLUME_DIR` (or, respectively, `! [ -w $VOLUME_DIR ]`) check. ## Motivation and Context The comparison of...
## Expected behavior We are archiving a website that has had a few incarnations; the older archives have pages with a .html extension and the newer archives have /. I...
Currently pywb can add WACZ files to a collection via unpacking. The next step is to properly support WACZ files as-is.
We've used Browsertrix against a MediaWiki instance, and we noticed that pages that have an apostrophe don't resolve properly when clicked in pywb. I loaded the same archive into replayweb.page...
## Describe the bug Docker container image does not exit gracefully no matter what I do to it. I am running a Kubernetes pod in which I wanted to crawl...