py-wacz issues

Python in the README

2

The README, as far as I can see, focuses on the command line. I have a dozen WACZ files created to archive Facebook posts of political leaders during...

jburnford

question

AttributeError: 'NoneType' object has no attribute 'lower'

I'm trying to create a WACZ from a warc.gz file. I want it to detect pages and create a full-text index. This is my command: `python3 -m wacz create...

nvanderperren

Windows 10 truncates read path and prevents validation

When I try to validate a .wacz file using PowerShell on Windows 10, it fails to discover the .warc.gz file in the archive folder and therefore fails to validate the...

sbshep

better documentation via `wacz --help`

The documentation via the `wacz --help` command is far too brief. Mention all options so there is no need to always come back to this repository to consult documentation.

nvanderperren

enhancement

Add index generation system that uses offsets into the WACZ itself.

2

This proposed new `py-wacz` command allows you to generate a CDXJ file where the filenames and offsets refer to the WACZ itself rather than the WARC files within. The idea...

anjackson
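
To illustrate what "offsets into the WACZ itself" could mean in the proposal above: a WACZ is a ZIP file, so a record's position inside a WARC can be turned into a position inside the WACZ by adding the byte offset at which that WARC's data begins in the ZIP. The sketch below uses only the Python standard library and is not the proposed `py-wacz` command. The helper name `member_data_offset` and the sample member name are hypothetical, and the sketch assumes the WARC is stored in the ZIP without compression.

```python
import io
import struct
import zipfile

LOCAL_HEADER_SIZE = 30  # fixed-size part of a ZIP local file header


def member_data_offset(raw, name):
    """Return the absolute byte offset in `raw` where member `name`'s data starts.

    `raw` is a seekable binary file object containing a ZIP (e.g. a WACZ).
    """
    with zipfile.ZipFile(raw) as zf:
        info = zf.getinfo(name)
    raw.seek(info.header_offset)
    header = raw.read(LOCAL_HEADER_SIZE)
    if header[:4] != b"PK\x03\x04":
        raise ValueError("not a ZIP local file header")
    # Bytes 26-29 hold the file name length and the extra field length.
    name_len, extra_len = struct.unpack("<HH", header[26:30])
    return info.header_offset + LOCAL_HEADER_SIZE + name_len + extra_len


# Build a tiny in-memory ZIP with one stored (uncompressed) member.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
    zf.writestr("archive/data.warc", b"AAAABBBB")

start = member_data_offset(buf, "archive/data.warc")
record_offset_in_warc = 4  # hypothetical offset of a record inside the WARC
absolute = start + record_offset_in_warc
payload = buf.getvalue()[absolute:absolute + 4]
print(payload)
```

Because the member is stored rather than deflated, the bytes at `absolute` in the ZIP are the same bytes found at `record_offset_in_warc` in the WARC, which is what lets an index point directly into the outer file.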

Canonical method for converting multiple WARC files to WACZ

4

I'm not sure if this is a feature request or just a request for clarification, but I'm looking for a canonical way to generate a WACZ file from multiple WARC...

jackdos