pywb

Core Python Web Archiving Toolkit for replay and recording of web archives

Results 189 pywb issues

I would like to know the original page source of a record. Is this possible? Say a Google font is used on the page https://example.com/about. A response or revisit record...

It would be nice if a revisit record has a WARC-Refers-To field as is recommended in the WARC specification. https://iipc.github.io/warc-specifications/specifications/warc-format/warc-1.0/#profile-identical-payload-digest
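As a rough illustration of the request above, the sketch below builds the named fields of a revisit record that carries `WARC-Refers-To`. It uses only the standard library and is not pywb's or warcio's code; the function name `build_revisit_headers`, the sample digest, and the sample record ID are hypothetical placeholders.

```python
import uuid
from datetime import datetime, timezone

# Profile URI defined by WARC 1.0 for identical-payload-digest revisits.
IDENTICAL_PAYLOAD_PROFILE = (
    "http://netpreserve.org/warc/1.0/revisit/identical-payload-digest"
)


def build_revisit_headers(target_uri, payload_digest, refers_to_id):
    """Return WARC named fields for a revisit record (hypothetical helper).

    refers_to_id is the WARC-Record-ID of the earlier response record
    whose payload this revisit duplicates.
    """
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "WARC-Type": "revisit",
        "WARC-Record-ID": "<urn:uuid:{}>".format(uuid.uuid4()),
        "WARC-Date": now,
        "WARC-Target-URI": target_uri,
        "WARC-Profile": IDENTICAL_PAYLOAD_PROFILE,
        "WARC-Payload-Digest": payload_digest,
        # The field the issue asks for: a pointer to the original record.
        "WARC-Refers-To": refers_to_id,
    }


# Placeholder values, for illustration only.
headers = build_revisit_headers(
    "https://example.com/about",
    "sha1:AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA",
    "<urn:uuid:00000000-0000-0000-0000-000000000000>",
)
```

With `WARC-Refers-To` present, a reader can resolve the revisit to one specific earlier record instead of searching by digest.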

I have a recording for which I configured dedup_policy: revisit. The request record for a resource that has already been visited has a WARC-Concurrent-To field. Unfortunately that field value does...

## Is your feature request related to a problem? Please describe.
I am working with filtered downloads of the Common Crawl dataset (~100TB, with plans to grow to ~200TB), so...

## Describe the bug
When used to record an HTTP response that uses `Transfer-Encoding: chunked`, pywb produces a WARC record where the chunks have been decoded but the `Transfer-Encoding` header...
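To make the mismatch concrete, here is a minimal standard-library sketch of decoding a chunked body and then adjusting the headers so they describe the decoded bytes. It is one possible way to keep headers and body consistent, not a description of what pywb does; the function names `decode_chunked` and `normalize_decoded_response` are hypothetical, and whether an archive should instead preserve the original bytes and headers is a separate design question.

```python
def decode_chunked(body: bytes) -> bytes:
    """Decode an HTTP/1.1 chunked message body (trailers are ignored)."""
    out = bytearray()
    pos = 0
    while True:
        line_end = body.index(b"\r\n", pos)
        # The chunk size is hexadecimal; anything after ';' is an extension.
        size = int(body[pos:line_end].split(b";", 1)[0].strip(), 16)
        pos = line_end + 2
        if size == 0:
            break
        out += body[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF
    return bytes(out)


def normalize_decoded_response(headers, body):
    """Decode a chunked body and make the headers match the decoded bytes.

    headers is a list of (name, value) tuples. If the response is not
    chunked, headers and body are returned unchanged.
    """
    encodings = [v for k, v in headers if k.lower() == "transfer-encoding"]
    if not any("chunked" in v.lower() for v in encodings):
        return headers, body
    decoded = decode_chunked(body)
    kept = [(k, v) for k, v in headers
            if k.lower() not in ("transfer-encoding", "content-length")]
    kept.append(("Content-Length", str(len(decoded))))
    return kept, decoded


raw = b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"
new_headers, new_body = normalize_decoded_response(
    [("Content-Type", "text/plain"), ("Transfer-Encoding", "chunked")], raw
)
```

A record that stores the decoded body alongside an unchanged `Transfer-Encoding: chunked` header would cause a strict client to try to parse chunk sizes out of plain payload bytes.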

Is it possible to set the `host_prefix` variable to a custom value when rewriting HTML? My current setup: pywb runs in a Docker container behind an nginx reverse proxy...

## Describe the solution you'd like
I want to archive an old Google site that requires a Google login, but the page doesn't work in pywb. I could access...

Hello there, I'd like to make a feature request - a way to list all URLs in `pywb`. [Web Archive Player](https://github.com/ikreymer/webarchiveplayer) has a similar feature, and it would be nice...
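Until such a feature exists, one workaround might be to read the URLs out of a collection's index. The sketch below assumes the index is in CDXJ form, where each line is `<urlkey> <timestamp> <json>` and the JSON object carries a `url` key; that layout is an assumption here, the function name `urls_from_cdxj` is hypothetical, and the sample lines are made up.

```python
import json


def urls_from_cdxj(lines):
    """Yield distinct original URLs from CDXJ lines, in first-seen order."""
    seen = set()
    for line in lines:
        # Assumed layout: "<urlkey> <timestamp> <json>".
        parts = line.strip().split(" ", 2)
        if len(parts) < 3:
            continue  # blank or malformed line
        try:
            fields = json.loads(parts[2])
        except ValueError:
            continue  # JSON block could not be parsed
        if not isinstance(fields, dict):
            continue
        url = fields.get("url")
        if url and url not in seen:
            seen.add(url)
            yield url


# Made-up sample index lines, for illustration only.
sample = [
    'com,example)/ 20210101000000 {"url": "https://example.com/", "status": "200"}',
    'com,example)/about 20210101000005 {"url": "https://example.com/about", "status": "200"}',
    'com,example)/ 20210202000000 {"url": "https://example.com/", "status": "200"}',
]
all_urls = list(urls_from_cdxj(sample))
```

Because the function is a generator that reads line by line, it can be fed an open file handle without loading a large index into memory, although the `seen` set still grows with the number of distinct URLs.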

## Describe the bug
When pywb rewrites `eval` it wraps it in a function call. Unfortunately this breaks code which declares function-scoped variables using `var` and then accesses them outside...

When I deploy to Heroku, it shows an application error. Here are the logs: ![image](https://user-images.githubusercontent.com/74786054/116933752-fcd31900-ac31-11eb-8d2e-5e7cfc4ea6d9.png)