webmagic

Cast to HttpRequestWrapper fails

Open luchatex opened this issue 8 years ago • 8 comments

ERROR us.codecraft.webmagic.downloader.CustomRedirectStrategy(CustomRedirectStrategy.java:37) ## 强转为HttpRequestWrapper出错 (the log message reads "error casting to HttpRequestWrapper")

When a POST request sends JSON, this error frequently shows up while the response is being handled.

luchatex avatar Jun 12 '17 01:06 luchatex

CustomRedirectStrategy may have a problem. Could you provide a test URL that reproduces the error?

code4craft avatar Jun 17 '17 02:06 code4craft

The cast fails when a proxy is in use, possibly because HttpRequestWrapper is outdated.

luchatex avatar Jun 26 '17 01:06 luchatex

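When I use a proxy and its IP has expired, the response is an error page. The cast of the request then fails, so that request is lost and never added back to the task queue. The error reported is that the cast to HttpRequestWrapper failed.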

07-02 13:47:21 DEBUG org.apache.http.impl.client.DefaultRedirectStrategy.getLocationURI(DefaultRedirectStrategy.java:142) - Redirect requested to location 'http://www.ivrn.net/warning/?n=20243&reason=3&s=16&id=11276903:8&ts=1498974427&str=BillingSet&referer=&cookie=&host=xxx.cn&url=/xxx/xxx&params='
07-02 13:47:21 ERROR us.codecraft.webmagic.downloader.CustomRedirectStrategy.getRedirect(CustomRedirectStrategy.java:37) - 强转为HttpRequestWrapper出错 (the log message reads "error casting to HttpRequestWrapper")

I replaced parts of the URL above with xxx.

luchatex avatar Jul 02 '17 09:07 luchatex

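HttpClient 4.5.2 already calls final HttpRequest redirect = this.redirectStrategy.getRedirect(currentRequest.getOriginal(), response, context); so the strategy receives the original request, and nothing there casts to the outdated HttpRequestWrapper. I suggest changing CustomRedirectStrategy accordingly.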
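
A minimal, library-free sketch of the failure mode described here. The types Request, OriginalRequest, and WrappedRequest are hypothetical stand-ins for HttpRequest, a plain request, and HttpRequestWrapper; they are not the HttpClient classes. It assumes, as the comment suggests, that the strategy is handed the original request rather than a wrapper, in which case an unconditional downcast throws ClassCastException while an instanceof check does not.

```java
public class CastSketch {
    // Hypothetical stand-ins for illustration; not the real HttpClient types.
    interface Request {
        String method();
    }

    static class OriginalRequest implements Request {
        public String method() { return "POST"; }
    }

    static class WrappedRequest implements Request {
        private final Request original;
        WrappedRequest(Request original) { this.original = original; }
        public String method() { return original.method(); }
    }

    // Unconditional downcast: throws ClassCastException for a non-wrapper.
    static String unsafeRedirect(Request request) {
        WrappedRequest wrapper = (WrappedRequest) request;
        return "reuse wrapper for " + wrapper.method();
    }

    // Guarded version: checks the runtime type before relying on it.
    static String safeRedirect(Request request) {
        if (request instanceof WrappedRequest) {
            return "reuse wrapper for " + request.method();
        }
        return "build new request for " + request.method();
    }

    public static void main(String[] args) {
        Request original = new OriginalRequest();
        System.out.println(safeRedirect(original));
        System.out.println(safeRedirect(new WrappedRequest(original)));
        try {
            unsafeRedirect(original);
        } catch (ClassCastException e) {
            System.out.println("cast failed for a non-wrapper request");
        }
    }
}
```

If the real strategy catches the exception and only logs it, the redirected request would still have to be rebuilt some other way, which is what the guarded branch stands for here.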

luchatex avatar Jul 02 '17 09:07 luchatex

HttpClient already has a method that handles this; I suggest fixing it along the same lines:
    final int status = response.getStatusLine().getStatusCode();
    if (status == HttpStatus.SC_TEMPORARY_REDIRECT) {
        return RequestBuilder.copy(request).setUri(uri).build();
    } else {
        return new HttpGet(uri);
    }
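
The decision in that snippet can be modeled without HttpClient on the classpath. The sketch below is only an illustration: Req is a hypothetical stand-in for a request (method, URI, optional body), and redirect is a made-up helper name. It assumes the intended behavior is "on 307, copy the request and change only the URI; otherwise fall back to a GET", where 307 is the value of HttpStatus.SC_TEMPORARY_REDIRECT.

```java
public class RedirectSketch {
    // Same numeric value as HttpStatus.SC_TEMPORARY_REDIRECT.
    static final int SC_TEMPORARY_REDIRECT = 307;

    // Hypothetical stand-in for an HTTP request; not an HttpClient type.
    static final class Req {
        final String method;
        final String uri;
        final String body; // null when the request carries no entity

        Req(String method, String uri, String body) {
            this.method = method;
            this.uri = uri;
            this.body = body;
        }
    }

    // Builds the follow-up request without casting the incoming one.
    static Req redirect(Req request, int status, String location) {
        if (status == SC_TEMPORARY_REDIRECT) {
            // Keep method and body, change only the target URI
            // (the role RequestBuilder.copy(request).setUri(uri) plays).
            return new Req(request.method, location, request.body);
        }
        // Any other redirect status: follow with a plain GET, no body.
        return new Req("GET", location, null);
    }

    public static void main(String[] args) {
        Req post = new Req("POST", "http://example.com/a", "{\"k\":1}");
        Req after307 = redirect(post, 307, "http://example.com/b");
        Req after302 = redirect(post, 302, "http://example.com/b");
        System.out.println(after307.method + " " + after307.uri);
        System.out.println(after302.method + " " + after302.uri);
    }
}
```

Because the follow-up request is always built fresh from the incoming one, no downcast is needed regardless of which concrete request type the strategy receives.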

luchatex avatar Jul 02 '17 14:07 luchatex

It has been more than a year. Is this still not fixed?

guwan avatar Apr 10 '18 09:04 guwan

This problem still exists. Why hasn't it been fixed?

bojiangwin avatar Nov 07 '18 03:11 bojiangwin

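I copied HttpClientDownloader, then copied and modified the HttpClientGenerator it uses, and finally copied and modified a RedirectStrategy inside the new generator, changing it as suggested in the comment above. That works around the problem for now.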

eee27 avatar Feb 11 '22 04:02 eee27