browsertrix-crawler

Implemented option for FullPage screenshot after the behaviours have run

Open fservida opened this issue 1 year ago • 3 comments

This PR implements a new option to take the FullPage screenshot after the behaviours have run instead of before. This makes it possible to capture elements that might not be fully loaded until the whole page has been visited.

I only implemented this for the full-page screenshot, as I think it is the only screenshot type where this is useful, since it can include content not currently in frame, and it is the only case I need. Feel free to expand on that as needed.

Related to #486

Tested with a custom Docker build, though I had to ensure setuptools v71 was used, as otherwise it would not build.

fservida avatar Jul 29 '24 12:07 fservida

Thanks for this - I think it would be clearer if it were part of the existing --screenshot flag, perhaps called fullPageAfterBehaviors. Do you mind refactoring it to use that? That way the values can be combined, e.g. --screenshot fullPage,fullPageAfterBehaviors.
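
To illustrate the suggestion, here is a minimal sketch of how a comma-separated --screenshot value could be split into screenshots taken before and after the behaviors run. The `planScreenshots` helper, the `ScreenshotPlan` type, and the `KNOWN_TYPES` list are hypothetical and not taken from the crawler's code; only the names fullPage and fullPageAfterBehaviors come from this thread.

```typescript
// Hypothetical list of accepted values; only these two names appear in the thread.
const KNOWN_TYPES = ["fullPage", "fullPageAfterBehaviors"] as const;
type ScreenshotType = (typeof KNOWN_TYPES)[number];

interface ScreenshotPlan {
  beforeBehaviors: ScreenshotType[];
  afterBehaviors: ScreenshotType[];
}

// Split a value such as "fullPage,fullPageAfterBehaviors" into two phases.
function planScreenshots(flagValue: string): ScreenshotPlan {
  const plan: ScreenshotPlan = { beforeBehaviors: [], afterBehaviors: [] };
  for (const raw of flagValue.split(",")) {
    const name = raw.trim();
    if (!name) continue; // tolerate stray commas
    if (!(KNOWN_TYPES as readonly string[]).includes(name)) {
      throw new Error(`Unknown screenshot type: ${name}`);
    }
    const type = name as ScreenshotType;
    if (type === "fullPageAfterBehaviors") {
      plan.afterBehaviors.push(type);
    } else {
      plan.beforeBehaviors.push(type);
    }
  }
  return plan;
}
```

Keeping both values under one flag means a single crawl can request the early and the late full-page capture together, which is the combination described above.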

ikreymer avatar Aug 02 '24 20:08 ikreymer

Reopening after adding changes

fservida avatar Oct 07 '24 13:10 fservida

@ikreymer I've reimplemented the changes as suggested, hope this can be useful

fservida avatar Oct 07 '24 13:10 fservida

@fservida Looks like we need to run the auto-formatter before merging to resolve linting issues. Would you mind pushing a commit to this branch after running yarn run format:fix? Thanks! :)

tw4l avatar Nov 05 '24 16:11 tw4l

I'm thinking we should rename it to fullPageFinal instead of fullPageAfterBehaviors, since behaviors may not run in some circumstances, and this is more consistent with the final-to-warc setting we have for text.
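
A small sketch of the reasoning behind the rename, assuming the late screenshot is simply the last step for a page whether or not behaviors ran. The `pageSteps` helper and the `Step` type are hypothetical illustrations, not the crawler's actual page-processing code.

```typescript
type Step = "load" | "fullPage" | "behaviors" | "fullPageFinal";

// Hypothetical helper: list the order of steps for one page.
// The final screenshot does not depend on behaviors having run.
function pageSteps(types: Set<string>, runBehaviors: boolean): Step[] {
  const steps: Step[] = ["load"];
  if (types.has("fullPage")) steps.push("fullPage");
  if (runBehaviors) steps.push("behaviors");
  if (types.has("fullPageFinal")) steps.push("fullPageFinal");
  return steps;
}
```

Under this assumption, a page where behaviors are skipped still gets its late capture, so a name tied to behaviors would be misleading while "final" stays accurate.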

ikreymer avatar Nov 23 '24 08:11 ikreymer