Alex Osborne
`/trove?q=type:urlquery+url:http://www.tisn.gov.au/Documents/CIPMA+tasking+application+form.doc` unhelpfully returns `java.lang.ArrayIndexOutOfBoundsException`. (The problem is that the value of the url field needs to be URL-encoded twice: once for the query string and once for opensearch.)
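A minimal sketch of the double encoding described above, using only Python's standard library. The helper name `double_encode` is hypothetical, and the sketch assumes every reserved character in the URL (including `+`) is meant literally; it is not taken from the service's own code.

```python
from urllib.parse import quote, unquote

def double_encode(value: str) -> str:
    """Percent-encode a value twice (hypothetical helper for illustration)."""
    # First pass: the layer that opensearch is assumed to decode.
    once = quote(value, safe="")
    # Second pass: the layer the HTTP query string decodes.
    return quote(once, safe="")

url = "http://www.tisn.gov.au/Documents/CIPMA+tasking+application+form.doc"
encoded = double_encode(url)

# Build the request path with the doubly encoded url field.
request_path = "/trove?q=type:urlquery+url:" + encoded
print(request_path)
```

Decoding `encoded` twice with `unquote` yields the original URL again, which is a quick way to check that both layers round-trip.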
e.g. posting the string 'test-integration/test1.cdx' instead of a CDX file (forgetting @ in the curl command-line) currently shows: ``` $ curl -X POST --data test-integration/test1.cdx http://localhost:8080/myindex java.lang.ArrayIndexOutOfBoundsException: 1 At line:...
Requests for URLs with an image file extension (e.g. foo.gif) might return an HTML 404 error message. In this case HTTrack appears to write the error message to a file...
In 3.49-2 we have: ``` hts-cache/new.txt:11:21:41 185/185 ---M-- 301 error ('Moved%20Permanently') text/html date:Tue,%2009%20Jan%202018%2002:21:41%20GMT http://test.example.org/redirect test.example.org/redirect (from http://test.example.org/) Binary file hts-cache/new.zip matches hts-ioinfo.txt:[1] request for test.example.org/redirect: hts-ioinfo.txt:
SemanticPrecise changes "%C3%23" in the query string to "%ef%bf%bd%23" which is different from both python urlcanon ("%c3%23") and webarchive-commons ("%25c3%23"). See also: https://github.com/iipc/webarchive-commons/pull/102
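One plausible reading of the `%ef%bf%bd%23` result, offered as an assumption rather than a confirmed diagnosis: `%C3` is a UTF-8 lead byte with no continuation byte after it, so a decode-then-re-encode pass replaces it with U+FFFD, whose UTF-8 encoding is `EF BF BD`. The sketch below reproduces that effect with Python's standard library only; it does not call urlcanon or webarchive-commons, and the variable names are illustrative.

```python
from urllib.parse import quote, unquote

raw = "%C3%23"

# Decode the escapes as UTF-8, substituting U+FFFD for invalid bytes.
decoded = unquote(raw, encoding="utf-8", errors="replace")

# Re-encode everything and lowercase the hex digits.
reencoded = quote(decoded, safe="").lower()

# Alternative: leave the escapes untouched and only normalize case,
# which matches the "%c3%23" form quoted above for python urlcanon.
case_only = raw.lower()

print(repr(decoded))
print(reencoded)
print(case_only)
```

The third form quoted above, `%25c3%23`, corresponds to escaping the `%` of the invalid sequence instead of decoding it.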
See https://github.com/iipc/webarchive-commons/pull/102
pywb [now uses](https://github.com/webrecorder/pywb/pull/893) the slightly different encoding (e.g. JSON booleans instead of Python booleans) used in warcio.js.
## Describe the bug

@ldko noticed that in client_side_replay mode there are failed requests for static assets hitting the ir_/ endpoint:

127.0.0.1 - - [2025-04-29 08:14:41] "GET /test/20250428230354ir_/http://localhost:8080/static/css/bootstrap.min.css HTTP/1.1" 404...
These can also redirect, not just 301 and 302. Fixes #564