wget-2-zim
creates ZIM files for Kiwix from arbitrary websites with wget and some nifty tricks (doesn't need ServiceWorkers)
It would be convenient to be able to specify a download directory when running the script (e.g. with `-d` flag or something), rather than having it download to the directory...
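A minimal sketch of how such a `-d` flag could be parsed with `getopts`; the variable name `DOWNLOADDIR` and the default of the current directory are my own assumptions, not taken from wget-2-zim:

```shell
#!/bin/bash
# Hypothetical option parsing for a download-directory flag.
# DOWNLOADDIR is an assumed variable name, not one from the script itself.
DOWNLOADDIR="$PWD"   # default: current directory, matching today's behavior

while getopts "d:" opt; do
	case "$opt" in
		d) DOWNLOADDIR="$OPTARG" ;;
		*) echo "usage: $0 [-d download_dir] url" >&2; exit 1 ;;
	esac
done
shift $((OPTIND - 1))

mkdir -p "$DOWNLOADDIR"
cd "$DOWNLOADDIR" || exit 1   # wget then writes everything below this directory
```

Since the rest of the script runs relative to the working directory, a single `cd` after parsing would be enough to redirect all of wget's output.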
I've noticed some `index.html` files were missing after scraping a site with your script. Seems the problem is that if wget downloads some ~binary~ files to a directory then a...
`find $DOMAIN -type f \( -name '*.htm*' -or -name '*.php*' \) -exec "$iterscript" "$DOMAIN" '{}' "$EXTERNALURLS" "$WGETREJECT" "$NOOVERREACH" -not -path "./$DOMAIN/wget-2-zim-overreach/*" \;`: https://github.com/ballerburg9005/wget-2-zim/blob/6f83e1125fe3cc09be4e2f1bc2c1fdc42959cc66/wget-2-zim.sh#L209. This line produces no debug output and...
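If I read the predicate order right, two things look off in that line: the `-not -path` test comes after the `-exec`, so it is only evaluated once the exec has already run, and the pattern `./$DOMAIN/...` can never match because find was started from `$DOMAIN`, so its paths begin with `$DOMAIN/`, not `./$DOMAIN/`. A runnable sketch of the reordered command (the demo setup and the use of `/bin/echo` as a stand-in iteration script are mine; the variable names mirror the script's):

```shell
# Demo setup so the corrected command can run stand-alone
# ($DOMAIN, $iterscript etc. mirror the script's variables).
DOMAIN=example.com
iterscript=/bin/echo   # stand-in for the real per-file script
EXTERNALURLS='' WGETREJECT='' NOOVERREACH=''
mkdir -p "$DOMAIN/wget-2-zim-overreach"
printf x > "$DOMAIN/index.html"
printf x > "$DOMAIN/wget-2-zim-overreach/skip.html"

# Suggested reordering: the -not -path filter must precede -exec,
# and the pattern must match find's actual output
# ("$DOMAIN/...", not "./$DOMAIN/...").
find "$DOMAIN" -type f \
	-not -path "$DOMAIN/wget-2-zim-overreach/*" \
	\( -name '*.htm*' -or -name '*.php*' \) \
	-exec "$iterscript" "$DOMAIN" '{}' "$EXTERNALURLS" "$WGETREJECT" "$NOOVERREACH" \;
```

With this ordering the overreach directory is filtered out before the exec fires, and only `example.com/index.html` is handed to the iteration script.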
Suggestion: Option to exclude certain webpaths from being crawled, or at least written, maybe both
For example in my run of http://www.someweb.com I would like to exclude all of http://www.someweb.com/boringnotes/ from being crawled/written since there is nothing of interest to me there.
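wget itself already has switches for this which the script could expose; `--exclude-directories` (`-X`) and `--reject-regex` are real wget options, though how they would be wired into wget-2-zim is my guess:

```shell
# Skip /boringnotes/ during recursive retrieval (real wget flags;
# the integration point in the script is hypothetical).
wget --recursive --no-parent \
	--exclude-directories=/boringnotes \
	http://www.someweb.com/

# Alternatively, reject by regex matched against the full URL:
wget --recursive --no-parent \
	--reject-regex 'boringnotes' \
	http://www.someweb.com/
```

Excluding at crawl time (rather than only at write time) also saves bandwidth, since wget never fetches the excluded subtree at all.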
currently building zim-tools from GitHub requires a version of libzim that Debian does not distribute, so building zim-tools on a Debian machine is not possible (or at least,...
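One workaround is to build a matching libzim from source first and point zim-tools at it. A sketch under the assumption that both openzim projects use their usual Meson/ninja layout; the install prefix and the multiarch pkg-config path are guesses that may need adjusting per Debian release:

```shell
# Build a recent libzim from source, then build zim-tools against it
# instead of Debian's packaged libzim. Paths are assumptions.
git clone https://github.com/openzim/libzim.git
cd libzim
meson setup build --prefix=/usr/local
ninja -C build && sudo ninja -C build install
cd ..

git clone https://github.com/openzim/zim-tools.git
cd zim-tools
# Make pkg-config find the freshly installed libzim first.
export PKG_CONFIG_PATH=/usr/local/lib/x86_64-linux-gnu/pkgconfig:$PKG_CONFIG_PATH
meson setup build
ninja -C build
```

Until Debian ships a newer libzim, pinning an older zim-tools tag that still builds against the packaged libzim would be the other option.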