httrack2warc

Non-200 status code handling

Open ato opened this issue 8 years ago • 0 comments

In HTTrack 3.49-2 we have:

hts-cache/new.txt:11:21:41	185/185	---M--	301	error ('Moved%20Permanently')	text/html	date:Tue,%2009%20Jan%202018%2002:21:41%20GMT	http://test.example.org/redirect	test.example.org/redirect	(from http://test.example.org/)
Binary file hts-cache/new.zip matches
hts-ioinfo.txt:[1] request for test.example.org/redirect:
hts-ioinfo.txt:<<< GET /redirect HTTP/1.1
hts-ioinfo.txt:[1] response for test.example.org/redirect:

The comment field of the corresponding new.zip entry contains:

HTTP/1.1 301 Moved Permanently
X-In-Cache: 1
X-StatusCode: 301
X-StatusMessage: Moved Permanently
X-Size: 185
Content-Type: text/html
Last-Modified: Tue, 09 Jan 2018 02:21:41 GMT
Location: http://test.example.org/another
X-Addr: test.example.org
X-Fil: /redirect
X-Save: test.example.org/redirect
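
For reference, a minimal sketch of reading those per-entry comments with Python's standard `zipfile` module. The function names are hypothetical and not part of httrack2warc. It assumes the comment is a status line followed by `Name: value` lines, as in the dump above, and that it decodes as UTF-8.

```python
import zipfile


def parse_cache_comment(comment):
    """Split a cache entry comment (bytes) into (status_line, headers)."""
    # Assumption: first line is the HTTP status line, the rest are header fields.
    lines = comment.decode("utf-8", errors="replace").splitlines()
    status_line = lines[0] if lines else ""
    headers = {}
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if sep:  # skip lines without a colon
            headers[name.strip()] = value.strip()
    return status_line, headers


def read_cache_entries(zip_path):
    """Yield (entry_name, (status_line, headers)) for each entry in the cache zip."""
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            # ZipInfo.comment holds the per-entry comment as bytes.
            yield info.filename, parse_cache_comment(info.comment)
```

Calling `read_cache_entries("hts-cache/new.zip")` on a mirror like the one above should then expose `X-StatusCode` and `Location` for the redirect entry without needing hts-ioinfo.txt.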

These are converted correctly if hts-ioinfo.txt is present. Without hts-ioinfo.txt, however, a resource record is currently created, which does not preserve the 301 status.
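
One possible direction, sketched in Python purely as an illustration: when hts-ioinfo.txt is missing, fall back to the cached headers and synthesize a response head for non-200 entries. This is an assumption about how a fix could look, not what httrack2warc does today. Both function names are hypothetical, as are the policy of treating a missing `X-StatusCode` as 200 and the choice to drop every `X-` field as HTTrack bookkeeping.

```python
def record_type_for(headers):
    """Pick a WARC record type from the cached status code (hypothetical policy)."""
    # Assumption: a missing X-StatusCode is treated as a plain 200.
    code = int(headers.get("X-StatusCode", "200"))
    return "resource" if code == 200 else "response"


def synthesize_response_head(status_line, headers):
    """Rebuild an HTTP response head from the cache comment fields."""
    # Assumption: every X-* field is HTTrack bookkeeping rather than
    # something the server sent, so it is left out of the rebuilt head.
    kept = [(k, v) for k, v in headers.items() if not k.startswith("X-")]
    lines = [status_line] + [f"{k}: {v}" for k, v in kept]
    return ("\r\n".join(lines) + "\r\n\r\n").encode("utf-8")
```

For the 301 entry above this would select a response record and keep `Content-Type`, `Last-Modified` and `Location`, so the redirect target survives the conversion.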

I don't think a cache entry is present at all in early versions of HTTrack. It might be possible to recreate redirects from the log messages though.

ato, Nov 01 '17 04:11