Problem: Documentation does not cover rules.yaml / HTML rewrite rules

Open rikkipitt opened this issue 2 years ago • 3 comments

Hi folks,

Is it possible to use rules to remove or update HTML elements within a page? If so, is there any documentation on utilising the rules.yaml?

For example, remove a specific script tag that is causing issues.

Cheers! Rikki

rikkipitt avatar Dec 07 '22 12:12 rikkipitt

Thanks @rikkipitt! We don't have documentation on rules.yaml yet and, as you've pointed out, it would be very useful. We will track progress on that in this issue.

As for your direct question on removing a script tag or other HTML element, I'll defer to @ikreymer for now.

tw4l avatar Jan 30 '23 21:01 tw4l

Hi @tw4l, really appreciate you getting back to me on this one! I'm still very much interested in finding out more about how to configure this, as it'll help with maintaining a project I'm building that utilises pywb. Thank you!

rikkipitt avatar Jan 30 '23 21:01 rikkipitt

I have a similar question: I'm using pywb to play back an archived site, and there's a broken image URL on a subset of pages that I'd like to fix. I didn't find out about rules.yaml until I found this issue, although I knew that rewriting was going on somewhere in the pipeline. Is there any way to configure additional rewrite rules from the pywb config file? I don't want to override the default rules, because I'm pretty sure I still want that behavior; I just want to add one more rule.

rlskoeser avatar May 20 '24 18:05 rlskoeser