When the Internet connection is cut while downloading a file, the file cannot be downloaded again once the connection comes back

Open qianguozheng opened this issue 9 years ago • 19 comments

I have polipo running on an OpenWrt router. When I cut the Internet connection on the router's WAN port while downloading a big file via polipo, the file can never be downloaded again unless I delete the local cache.

I have added some debug messages to see what happens and reviewed the code again and again. I found that polipo does have the httpTimeout check for that case, but only the server connection gets the timeout; the client's connection stays up. @jech do you have any suggestion on that?

What I am thinking is that when the server connection times out we should shut down the client fd and destroy the object, but currently I can't find the client's fd when the server fd times out.
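
Something like the following is what I have in mind, as a rough sketch with made-up names (not polipo's real structures or functions):

```c
/* Hypothetical sketch -- not polipo's actual data structures or API.
   Idea: when the server-side fetch times out, walk the requests that
   are waiting on the same cached object and tear down their client
   connections, so clients don't hang on a half-filled cache entry. */
#include <sys/socket.h>
#include <unistd.h>

struct client_request {
    int client_fd;                   /* socket back to the browser */
    struct client_request *next;     /* next request waiting on this object */
};

struct cached_object {
    int complete;                    /* 1 once the full body has arrived */
    struct client_request *requests; /* clients waiting on this object */
};

/* Called from the (hypothetical) server timeout handler. */
static void abort_object_clients(struct cached_object *obj)
{
    for (struct client_request *req = obj->requests; req; req = req->next) {
        shutdown(req->client_fd, SHUT_RDWR); /* wake the client up */
        close(req->client_fd);
    }
    obj->requests = NULL;
    obj->complete = 0;               /* the cached data is only partial */
}
```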

qianguozheng avatar Aug 05 '15 10:08 qianguozheng

Do not use the cache in polipo! For your case, make sure to use the HTTP CONNECT method and the latest browser version. If you still fail, press F12 in Google Chrome, open the Network tab, and take a look at the server's headers. If the server doesn't support resuming, no client-side software will help you.

netsafe avatar Aug 05 '15 17:08 netsafe

@netsafe I'm afraid there is some misunderstanding.

  1. polipo is a web cache server; if we don't use its cache, I'm afraid it is useless.
  2. I've already set it up as a transparent proxy with cache by rewriting the request.
  3. I don't follow what you said. The current case is: when the Internet is suddenly cut, the download times out on the server socket, but the client side stays connected. This is the abnormal case, and it may not have been taken into consideration during polipo's development.

Could anyone give some suggestions on how to handle this abnormal case?

qianguozheng avatar Aug 06 '15 02:08 qianguozheng

@qianguozheng Did I get you right that you want help with point 3 in your list? You want to fix this abnormal behaviour?

netsafe avatar Aug 06 '15 14:08 netsafe

@netsafe You got it. Actually I want the author to know that polipo may not have taken such an abnormal case into consideration; we need to implement a mechanism to make it work right.

qianguozheng avatar Aug 07 '15 04:08 qianguozheng

It's not a polipo issue: you need to set proper TCP timeouts, max connections and open-file limits in all the config files and sysctls, and also don't forget to add 80, 8080, 443 and 8443 to the long-lived ports in your Tor config; it will save your day. After tweaking all the sysctls in the system-wide sysctl.conf, make sure to perform a full reboot. Feel free to ask if you have any further questions!
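
For example, the host-wide tuning could look roughly like this (the values below are only illustrative examples, not recommendations for your box):

```
# /etc/sysctl.conf -- illustrative values only
net.ipv4.tcp_keepalive_time = 120    # start keepalive probes after 2 min idle
net.ipv4.tcp_keepalive_intvl = 30    # probe every 30 s
net.ipv4.tcp_keepalive_probes = 4    # give up after 4 failed probes
net.ipv4.tcp_retries2 = 8            # fail stalled connections sooner
fs.file-max = 65535                  # raise the open-file ceiling

# /etc/tor/torrc -- keep the proxy ports on long-lived circuits
LongLivedPorts 21, 22, 80, 443, 706, 8080, 8443
```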

netsafe avatar Aug 07 '15 13:08 netsafe

@netsafe WHY is it not polipo's issue? Shouldn't it check whether the response is finished or not? I think it should set a flag recording whether that exact response finished; otherwise it can cause the issue I mentioned above.
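
As a rough sketch of the flag idea, with hypothetical names rather than polipo's real object code:

```c
/* Hypothetical sketch of the "finished" flag -- not polipo's real code.
   Only serve an entry from cache once the whole body has been received;
   otherwise the entry should be revalidated or re-fetched upstream. */

struct cache_entry {
    long long expected_length;  /* from Content-Length, -1 if unknown */
    long long received_length;  /* bytes actually stored so far */
    int complete;               /* set only when the body fully arrived */
};

/* Called when the upstream connection closes or times out. */
static void finish_entry(struct cache_entry *e)
{
    e->complete = (e->expected_length >= 0 &&
                   e->received_length == e->expected_length);
}

/* Called on a client request: only serve complete entries from cache. */
static int serve_from_cache(const struct cache_entry *e)
{
    return e->complete;         /* otherwise revalidate or re-fetch */
}
```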

qianguozheng avatar Aug 18 '15 07:08 qianguozheng

It can't, under Linux at least, if the session timeout is TOO LONG. Setting explicit timeouts in app code is a very rare thing, unless you're in great need of ESPECIALLY THIS kind of socket tuning. It must be tuned on the host itself, host-wide, because it affects not just one program (polipo, for example) but the whole pack of processes running on the box. And tweaking such parameters is an ABSOLUTELY standard, basic every-day SA task.
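
Just to illustrate what per-socket tuning in app code would look like on Linux (example values only; TCP_USER_TIMEOUT needs kernel 2.6.37 or later):

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Illustrative per-socket timeout tuning on Linux; values are examples. */
static int tune_socket(int fd)
{
    int on = 1;
    int idle = 60, interval = 15, count = 4;  /* keepalive schedule */
    unsigned int user_timeout = 120000;       /* ms before an unacked send fails */

    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
        return -1;
    setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle, sizeof(idle));
    setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &interval, sizeof(interval));
    setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &count, sizeof(count));
    /* Linux-specific: fail the connection if sent data stays unacknowledged. */
    setsockopt(fd, IPPROTO_TCP, TCP_USER_TIMEOUT, &user_timeout, sizeof(user_timeout));
    return 0;
}
```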

netsafe avatar Aug 18 '15 11:08 netsafe

I've actually had the same issue; while downloading large files, if the Internet connection becomes slow and/or unusable and the download is stopped it will fail to download the rest of the file properly on the second try (after connectivity is improved). It seems like Polipo downloads part of the file to its cache, and then on the second request from the client the partial file is served (but not the rest of it). The client just hangs, waiting around for Polipo to serve up the rest of the content.

brodyhoskins avatar Sep 02 '15 12:09 brodyhoskins

@brodyhoskins Have you tried disabling the cache in the Polipo config and in the HTTP request headers (the requests made to download the file)?

netsafe avatar Sep 02 '15 23:09 netsafe

No I haven't disabled the cache, because I'm using Polipo to do caching.

brodyhoskins avatar Sep 03 '15 21:09 brodyhoskins

@brodyhoskins Try disabling the cache in polipo and use a caching server like Squid, NGinx or Apache if polipo's caching is the problem here. These servers won't eat many resources, but their caching mechanisms are way better than polipo's, IMHO.

netsafe avatar Sep 04 '15 13:09 netsafe

I'll look into those, especially if they compile on (jailbroken) iOS. But I've grown fond of Polipo. ^^

brodyhoskins avatar Sep 05 '15 01:09 brodyhoskins

The main advantage of polipo caching, in theory at least, is that it can cache partial downloads. None of the others you mention can do partial download caching. In fact, polipo is the only HTTP cache I know of that claims to support caching of partial downloads*.

IMHO if polipo's caching doesn't work correctly, then there's not much point in using polipo at all.

I would argue that this is indeed a bug in polipo. If a fetch failed because the upstream connection was interrupted, polipo shouldn't behave as if the fragment it has cached is the whole object. When the fetch fails it should either invalidate that cache entry, or (even better) cache it as a partial fetch.

* Squid can kind of cache partial downloads if you tune the settings, but it does it by fetching and caching the whole object even in response to a small ranged fetch request.
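
For reference, the Squid tuning I mean is roughly this (example values; check the Squid documentation for your version):

```
# squid.conf -- make Squid fetch and cache the whole object even for Range requests
range_offset_limit -1        # always fetch from offset 0, whatever range the client asked for
quick_abort_min -1 KB        # keep fetching even after the client disconnects
maximum_object_size 4 GB     # allow large files into the cache
```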

dbaarda avatar Sep 05 '15 04:09 dbaarda

@brodyhoskins Apache and NGinx do compile; I did it a long time ago. I can't say the same for sure about Squid, but I see no problems.

netsafe avatar Sep 05 '15 10:09 netsafe

@dbaarda the main feature of polipo is its VERY good and handy HTTP/HTTPS CONNECT -> SOCKS connectivity. It helps a lot.

netsafe avatar Sep 05 '15 10:09 netsafe

True... it is a pretty good HTTP/HTTPS -> SOCKS proxy, and also HTTP 1.0 -> HTTP 1.1 proxy.

But I don't use it for that, and in theory it should also be a very good caching proxy, so it's sad that minor bugs like this undermine it for that purpose.

dbaarda avatar Sep 06 '15 02:09 dbaarda

This looks like a bug in Polipo, no argument about that.

jech avatar Jan 31 '16 15:01 jech

@jech, yeah, I think it is a bug in Polipo. As for a solution to this problem, should we discuss and track a state for downloaded files? If we do so, how should we judge whether a file downloaded successfully or failed? Maybe it failed because the network went down? Any thoughts on that? Thanks.

qianguozheng avatar Feb 01 '16 00:02 qianguozheng

Polipo already detects interrupted downloads, either by using the Content-Length header or by detecting unterminated chunked encoding.
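
Roughly speaking, the chunked-encoding side of that check works like this (a simplified illustration, not the actual parser):

```c
/* Simplified illustration -- not polipo's actual parser.  A chunked
   body is only complete once the zero-length "last chunk" has arrived;
   a connection that dies before then left a truncated object behind. */

struct chunked_state {
    int saw_last_chunk;         /* set when a "0\r\n" chunk-size line is parsed */
};

static void on_chunk_size(struct chunked_state *s, unsigned long size)
{
    if (size == 0)
        s->saw_last_chunk = 1;  /* terminating chunk: body is complete */
}

/* Called when the upstream connection closes. */
static int body_was_truncated(const struct chunked_state *s)
{
    return !s->saw_last_chunk;  /* closed before the last chunk => interrupted */
}
```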

jech avatar Mar 23 '16 23:03 jech