web.archive.org links cause exceptions, or links are transformed

Open gitressa opened this issue 6 years ago • 0 comments

These two links result in two different exceptions:

{"distance":1,"exception":"The base path \"\/web\/20050309042332\/http:\/\/www.fsk.dk\/fsk\/div\/interception\" is not an absolute path.","referrer":"https:\/\/example.org\/test","referrer_title":"http:\/\/web.archive.org\/web\/20050309042332\/http:\/\/www.fsk.dk\/fsk\/div\/interception\/aflytningcampbellsrapportpaadanskversion2.htm","referrer_xpath":"\/html\/body\/div[2]\/div\/section\/div\/section\/article\/div\/div\/div\/p\/a[2]","request_time":320473,"status":200,"url":"http:\/\/web.archive.org\/web\/20050309042332\/http:\/\/www.fsk.dk\/fsk\/div\/interception\/aflytningcampbellsrapportpaadanskversion2.htm","timestamp":"2019-06-09T20:01:07+02:00"}

{"distance":1,"exception":"The base path must be a non-empty string. Got: \"\"","referrer":"https:\/\/example.org\/test","referrer_title":"https:\/\/web.archive.org\/web\/20190525092213\/https:\/\/www.fsk.dk\/","referrer_xpath":"\/html\/body\/div[2]\/div\/section\/div\/section\/article\/div\/div\/div\/p\/a[3]","request_time":818032,"status":200,"url":"https:\/\/web.archive.org\/web\/20190525092213\/https:\/\/www.fsk.dk","timestamp":"2019-06-09T20:01:07+02:00"}

Formatted for easier reading:

{
  "link": "http://web.archive.org/web/20050309042332/http://www.fsk.dk/fsk/div/interception/aflytningcampbellsrapportpaadanskversion2.htm",
  "status": 200,
  "exception": "The base path \"/web/20050309042332/http://www.fsk.dk/fsk/div/interception\" is not an absolute path."
}
{
  "link": "https://web.archive.org/web/20190525092213/https://www.fsk.dk/",
  "status": 200,
  "exception": "The base path must be a non-empty string. Got: \"\""
}
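A minimal sketch of one possible cause, not confirmed against Fink's source: both base paths in the messages above can be reproduced if the path handling treats everything up to the first `://` inside the URL path as a scheme. The helpers `get_directory` and `is_absolute` below are hypothetical stand-ins written for illustration; they are not Fink's or any library's actual API.

```python
from urllib.parse import urlsplit


def get_directory(path: str) -> str:
    """Hypothetical: directory of a path, keeping any 'scheme://' prefix aside."""
    scheme, sep, rest = path.partition("://")
    if sep:
        scheme += sep          # e.g. "/web/20050309042332/http://"
    else:
        scheme, rest = "", path
    pos = rest.rfind("/")
    if pos == -1:
        return ""              # no slash left after the "scheme": empty directory
    if pos == 0:
        return scheme + "/"
    return scheme + rest[:pos]


def is_absolute(path: str) -> bool:
    """Hypothetical: strip a 'scheme://' prefix, then require a leading slash."""
    _, sep, rest = path.partition("://")
    if sep:
        path = rest
    return path.startswith("/")


url_1 = ("http://web.archive.org/web/20050309042332/"
         "http://www.fsk.dk/fsk/div/interception/"
         "aflytningcampbellsrapportpaadanskversion2.htm")
url_2 = "https://web.archive.org/web/20190525092213/https://www.fsk.dk"

# The URL path of an archived page itself contains "http://" or "https://".
base_1 = get_directory(urlsplit(url_1).path)
base_2 = get_directory(urlsplit(url_2).path)

print(repr(base_1), is_absolute(base_1))
print(repr(base_2))
```

Under these assumptions, the first link yields the base path quoted in the "is not an absolute path" message, and the second yields the empty string quoted in the "must be a non-empty string" message.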

Another oddity: the same link is checked unchanged on one server but transformed on another. On the first server it is checked as written: http://web.archive.org/web/20050309042332/http://www.fsk.dk/fsk/div/interception/aflytningcampbellsrapportpaadanskversion2.htm

But on the other server the link is somehow transformed by Fink, and gets checked like this: http://web.archive.org/titlelist/eche//web/20050309042332/http://www.fsk.dk/fsk/div/interception/aflytningcampbellsrapportpaadanskversion2.htm ... where titlelist/eche/ has become part of the URL.

This is the result from the server where the link is transformed:

{"distance":3,"exception":null,"referrer":"https:\/\/example.org\/titlelist\/eche","referrer_title":"Listening:","referrer_xpath":"\/html\/body\/div\/div\/div\/section\/div[2]\/section[2]\/div\/div\/div\/div[1]\/span[2]\/div\/p[12]\/a","request_time":1562045,"status":404,"url":"http:\/\/web.archive.org\/titlelist\/\/web\/20050309042332\/http:\/\/www.fsk.dk\/fsk\/div\/interception\/aflytningcampbellsrapportpaadanskversion2.htm","timestamp":"2019-06-04T01:15:01+02:00"}
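For comparison, a small sketch of what standard reference resolution does with this link. Python's `urljoin` leaves an absolute link unchanged regardless of the referrer. The `naive` value is only a hypothetical reconstruction: it assumes the link's path was treated as relative and appended to the referrer's directory, which happens to reproduce the `url` field in the JSON result above. Whether Fink actually does this is not confirmed here.

```python
import posixpath
from urllib.parse import urljoin, urlsplit

referrer = "https://example.org/titlelist/eche"
link = ("http://web.archive.org/web/20050309042332/"
        "http://www.fsk.dk/fsk/div/interception/"
        "aflytningcampbellsrapportpaadanskversion2.htm")

# Standard resolution: an absolute link is returned as-is.
resolved = urljoin(referrer, link)

# Hypothetical faulty resolution: append the link's path to the
# directory of the referrer's path, keeping the link's own host.
parts = urlsplit(link)
referrer_dir = posixpath.dirname(urlsplit(referrer).path)  # "/titlelist"
naive = f"{parts.scheme}://{parts.netloc}{referrer_dir}/{parts.path}"

print(resolved == link)
print(naive)
```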

gitressa, Jun 09 '19 18:06