httrack is prepending to anchor href values in a broken way.

Open trusktr opened this issue 7 years ago • 1 comments

For example, on http://famous.org/docs, the links on the left are like this:

<a href="/docs/components/Position.html">Position</a>

but in the downloaded version httrack is outputting:

<a href="index.htmlcomponents/Position.html">Position</a>

As you can see, index.htmlcomponents/Position.html is wrong, and it breaks.

If I change the URL in my browser address bar from

file:///Users/jpea/Desktop/famous%20docs/famous.org/docs/index.htmlcomponents/Position.html

to

file:///Users/jpea/Desktop/famous%20docs/famous.org/docs/components/Position.html

then the file is not found, so it looks like httrack is not downloading the file.

The command I ran is:

httrack famous.org/docs famo.us/docs famous.org/examples/blocks/position/manifest.json famous.org/examples/blocks/align/manifest.json famous.org/examples/blocks/alignposition/manifest.json famous.org/examples/blocks/mountpoint/manifest.json famous.org/examples/blocks/origin/manifest.json famous.org/examples/blocks/positioncomponent/manifest.json -O ./ -n -T30 -B -v '-*slack.com*' '-*github.com*' '+*famous.org*' '+*amazonaws.com/*famo.us*' '+*.json' '+*.ttf'

Why is httrack outputting these broken URLs? How can we fix it?

trusktr • May 31 '17 02:05

First off, that menu is not coded in the web page. It is built with jQuery, on the fly, in your browser when you view the page, so it is not HTTrack doing it.

At first glance it looks like the script is NOT making the link in the normal way, with a relative link "components/Position.html", but is instead building an absolute link from the current URL + "components/Position.html".

So on the site the current URL is /docs/, which makes the correct /docs/components/Position.html.

BUT on your hard drive the current URL is /docs/index.html, which makes /docs/index.htmlcomponents/Position.html.

This is simply the nature of a web server vs. a file system.
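
To illustrate the difference, here is a minimal sketch. It assumes the menu script appends the path to the current URL as a plain string; I have not read the site's actual code, so `naiveJoin` is a hypothetical stand-in for whatever it does. `resolveRelative` is also a name made up for this example; it uses the standard `URL` constructor, which resolves against the directory of the base URL instead.

```javascript
// Hypothetical stand-in for the menu script: appends the relative path
// to the current page URL as a plain string (an assumption, not the site's code).
function naiveJoin(currentUrl, relativePath) {
  return currentUrl + relativePath;
}

// Standard resolution: the URL constructor drops a trailing file name
// such as index.html from the base before appending the relative path.
function resolveRelative(currentUrl, relativePath) {
  return new URL(relativePath, currentUrl).href;
}

const onServer = "http://famous.org/docs/";
const onDisk =
  "file:///Users/jpea/Desktop/famous%20docs/famous.org/docs/index.html";
const rel = "components/Position.html";

// On the web server the current URL ends in "/", so concatenation happens to work.
console.log(naiveJoin(onServer, rel));

// On disk the current URL ends in "index.html", so concatenation glues the two together.
console.log(naiveJoin(onDisk, rel));

// Proper resolution points at the components/ directory in both cases.
console.log(resolveRelative(onDisk, rel));
```

The second line printed is the broken `index.htmlcomponents/Position.html` path from the original report, while the third is the path the browser would need.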

The problem is that the copy is on a hard drive; it should work if you put it up on any web server.

The other problem is, as you said, that HTTrack is not finding this jQuery menu, so it is not downloading the pages.

mpheyse • Sep 06 '17 00:09