warc2zim

Raise warnings when there is a conflict of http/https and/or ports and/or ...

Open benoit74 opened this issue 9 months ago • 0 comments

Do we want to raise a warning in the logs (or fail the scraper?) when two WARC records lead to the same ZIM path, most probably because of a conflict between http and https URLs?

It would be great to ensure the warning is displayed only when the resources really differ, but HTTP redirections make this hard.

Not sure it is really worth it (at least we already have lots of debug messages "Skipping duplicate {url}, already added to ZIM"), so this has to be analyzed in detail.
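
To make the idea concrete, here is a rough sketch of what such a check could look like. It is not how warc2zim currently works: `zim_path` and `ConflictTracker` are hypothetical names, the normalization (drop scheme and port, keep host, path and query) is an assumption about how http/https and port variants collapse to one ZIM path, and redirect records are simply left out of the comparison since they are the hard part.

```python
import hashlib
import logging
from urllib.parse import urlsplit

logger = logging.getLogger("conflict-sketch")


def zim_path(url: str) -> str:
    """Hypothetical normalization: drop scheme and port, keep host + path + query."""
    parts = urlsplit(url)
    path = (parts.hostname or "") + parts.path
    if parts.query:
        path += "?" + parts.query
    return path


class ConflictTracker:
    """Remembers the first payload seen for each ZIM path (illustrative only)."""

    def __init__(self) -> None:
        # ZIM path -> (first URL seen, SHA-256 hex digest of its payload)
        self.seen: dict[str, tuple[str, str]] = {}
        # (ZIM path, first URL, conflicting URL)
        self.conflicts: list[tuple[str, str, str]] = []

    def check(self, url: str, status: int, payload: bytes) -> bool:
        """Return True when this record conflicts with an earlier one."""
        if 300 <= status < 400:
            # Redirects carry no meaningful payload; this sketch ignores them.
            return False
        path = zim_path(url)
        digest = hashlib.sha256(payload).hexdigest()
        if path not in self.seen:
            self.seen[path] = (url, digest)
            return False
        first_url, first_digest = self.seen[path]
        if digest == first_digest:
            # Same content under another scheme/port: a harmless duplicate.
            logger.debug("Skipping duplicate %s, already added to ZIM", url)
            return False
        logger.warning(
            "Conflict on ZIM path %s: %s and %s have different content",
            path, first_url, url,
        )
        self.conflicts.append((path, first_url, url))
        return True


tracker = ConflictTracker()
tracker.check("http://example.com/a", 200, b"one")
tracker.check("https://example.com/a", 200, b"one")       # duplicate, debug only
tracker.check("https://example.com:8443/a", 200, b"two")  # conflict, warning
```

Comparing payload digests keeps the warning quiet for the common case where the http and https variants serve identical content. Whether a conflict should only warn or fail the scraper is still the open question.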

benoit74, May 24 '24 06:05