
sometimes caddy closes connection to client prematurely

Open Anonymous-Coward opened this issue 4 years ago • 9 comments

$caddy --version:

Caddy v1.0.3

Possibly related to https://github.com/caddyserver/caddy/issues/1710

Caddy is used as a proxy, with some custom modules thrown in to handle a custom authentication mechanism.

What we see:

  • client sends request to caddy
  • caddy forwards request to backend app
  • backend app returns a chunked response
  • backend app sends a RST as soon as it finishes sending the chunked response, which spans multiple TCP packets
  • immediately, before it has finished relaying the chunked response, caddy sends a RST to the client too and never sends the remaining chunks

The consequence is that the client never gets the full response.

I can't easily provide the actual tcp dumps, since they're filled with information not meant for public visibility. I'll try to reproduce the bug using a synthetic test application. But in repeated captures reproducing the error, the RST from the backend app to caddy is frequently followed in the dump immediately by the RST sent by caddy to the client, with no packets captured in between. When there are packets between them, they're more often than not various unrelated ACKs.
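The backend behaviour described above could be approximated with a small synthetic server. The following is only a sketch under assumptions, not the real backend: the port 9000, the line count, the 10 ms pacing, and the names `chunk` and `serveThenReset` are all made up for illustration. It writes a chunked response in several separate writes and then closes the socket with `SO_LINGER` set to 0, which makes the kernel send a RST instead of a normal FIN.

```go
package main

import (
	"fmt"
	"io"
	"log"
	"net"
	"time"
)

// chunk encodes s as one HTTP/1.1 chunk: hex length, CRLF, data, CRLF.
func chunk(s string) string {
	return fmt.Sprintf("%x\r\n%s\r\n", len(s), s)
}

// serveThenReset answers one request with a chunked body sent in several
// writes, then arranges for the deferred Close to emit a RST.
func serveThenReset(conn net.Conn, lines int) error {
	defer conn.Close()

	// Read and ignore the request head. A single read is enough for a
	// small GET in this sketch; a real server would parse the request.
	if _, err := conn.Read(make([]byte, 4096)); err != nil {
		return err
	}

	parts := []string{"HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\nTransfer-Encoding: chunked\r\n\r\n"}
	for i := 0; i < lines; i++ {
		parts = append(parts, chunk(fmt.Sprintf("line %d\n", i)))
	}
	parts = append(parts, "0\r\n\r\n") // terminating zero-length chunk

	for _, p := range parts {
		if _, err := io.WriteString(conn, p); err != nil {
			return err
		}
		time.Sleep(10 * time.Millisecond) // pacing, so writes tend to land in separate packets
	}

	// With linger 0, Close discards any unsent data and sends RST.
	if tcp, ok := conn.(*net.TCPConn); ok {
		return tcp.SetLinger(0)
	}
	return nil
}

func main() {
	// Hypothetical address; point the proxy's upstream at it.
	ln, err := net.Listen("tcp", "127.0.0.1:9000")
	if err != nil {
		log.Fatal(err)
	}
	for {
		conn, err := ln.Accept()
		if err != nil {
			log.Fatal(err)
		}
		go func() {
			if err := serveThenReset(conn, 10); err != nil {
				log.Println("serve:", err)
			}
		}()
	}
}
```

Pointing a proxy at this server and capturing both legs should show whether the upstream RST is propagated to the client before the buffered chunks are relayed.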

Anonymous-Coward avatar Jul 07 '20 12:07 Anonymous-Coward

Thanks for the report.

First, can you please upgrade to Caddy 2? Caddy 1 is no longer being developed.

Then, can you please try to reproduce the problem without any custom modules/plugins -- this is crucial so that we can reproduce it reliably if we are to fix any bug.

Thanks! I'll reply with an issue template if we still need more information.

mholt avatar Jul 07 '20 13:07 mholt

  1. Is there some documentation that could help me set up and run stock caddy as a proxy locally? I'm not developing/maintaining the caddy-based solution, I just deploy it.

  2. Even if I manage to get things running locally, might be difficult to reproduce - I have to produce a pretty particular temporal sequence of sending and receiving packets from the client and from the server, and controlling that at the tcp level is pretty difficult, with the high level tools I'm using. Do you have any advice for how to achieve this?

Anonymous-Coward avatar Jul 07 '20 14:07 Anonymous-Coward

> Is there some documentation that could help me set up and run stock caddy as a proxy locally?

Yep, in our documentation just click "reverse proxy": https://caddyserver.com/docs/quick-starts/reverse-proxy

> Even if I manage to get things running locally, might be difficult to reproduce - I have to produce a pretty particular temporal sequence of sending and receiving packets from the client and from the server, and controlling that at the tcp level is pretty difficult, with the high level tools I'm using. Do you have any advice for how to achieve this?

Reduce the reproducible case to the minimum possible. Start stripping things from the config and the build until the problem no longer occurs.

If it can't be easily reproduced, then the best bet is for you to troubleshoot it yourself and submit a patch if there's a bug (along with documentation explaining the problem).

mholt avatar Jul 07 '20 15:07 mholt

I have created a server and a client to reproduce the bug.

The attached archive contains a Java maven project together with a README file with detailed instructions on how to set up, build and run the bug reproduction scenario.

What I tested with was caddy 2.1.1, which I believe is the latest release.

Anonymous-Coward avatar Jul 10 '20 13:07 Anonymous-Coward

Hi @mholt can you confirm the error/bug? What are the next steps? Thx.

Andy Krieger, Daimler AG on behalf of Daimler TSS GmbH.

ANDKRI avatar Jul 15 '20 09:07 ANDKRI

Sorry, I've been focused on finishing the upgrades to the website, and got behind on issues and PRs. I'll circle back to this as soon as I can. Although, it might take a while, as there's quite a high barrier to reproducing the bug. It'll take me about a day to download and set up a VM with Java, etc, and that doesn't even include looking at the code and analyzing what is going on, which will take even more time. The repro instructions seem great, it's just... a lot of extra work. I suspect this will take me a few working days to make any sense of it. It's a lot of time to budget (voluntarily).

I think the fastest path right now, since you are already set up to reproduce the issue, is to start looking into the code and determine why what you're experiencing may be happening.

mholt avatar Jul 17 '20 22:07 mholt

> I have created a server and a client to reproduce the bug.
>
> The attached archive contains a Java maven project together with a README file with detailed instructions on how to set up, build and run the bug reproduction scenario.
>
> What I tested with was caddy 2.1.1, which I believe is the latest release.

I've run your project and was able to reproduce it consistently with this:

java -jar target/caddybug-0.0.1-SNAPSHOT-single.jar \
--lines=100 \
--size=100 \
--duration=100 \
"http://localhost:2080/random?lines=10&size=40000&duration=1000"

It reproduces consistently when one end has a shorter duration than the other. My hunch is that, since there's simultaneous two-way copying, whichever side finishes first tries to close the connection. I'm attaching the pcaps from Wireshark of the 2 sessions from when I ran the reproduction code.

reproduction.pcapng.zip

two-sided-reproduction.pcapng.zip

I wonder if golang/go#15527 or golang/go#22209 are related.

mohammed90 avatar Jul 24 '20 16:07 mohammed90

I also encountered a similar problem.

daiaji avatar Aug 19 '20 12:08 daiaji

Quite likely this could be related to a bug upstream (based on what I've read around other issues in other repos): https://github.com/golang/go/issues/40747

mholt avatar Jul 22 '22 22:07 mholt

There's new information about this as per https://github.com/golang/go/issues/40747#issuecomment-1382404132

francislavoie avatar Feb 26 '23 04:02 francislavoie

This can probably be resolved by enabling full-duplex as of v2.7.0.
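For context, Go 1.21 added `http.ResponseController.EnableFullDuplex`, which lets an HTTP/1 handler keep reading the request body while it writes the response. I believe the Caddy setting is the `enable_full_duplex` option under the `servers` global option, but please check the docs for the exact name. The sketch below only shows the Go-level mechanism, not Caddy's actual code; `fullDuplexEcho` and `echoOnce` are hypothetical names, and it needs Go 1.21 or later.

```go
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
	"net/http/httptest"
	"strings"
)

// fullDuplexEcho streams the request body back to the client as it arrives.
// Without EnableFullDuplex, the HTTP/1 server consumes any unread request
// body before it starts writing the response.
func fullDuplexEcho(w http.ResponseWriter, r *http.Request) {
	rc := http.NewResponseController(w)
	if err := rc.EnableFullDuplex(); err != nil {
		http.Error(w, "full duplex unsupported: "+err.Error(), http.StatusInternalServerError)
		return
	}
	w.WriteHeader(http.StatusOK)

	buf := make([]byte, 4096)
	for {
		n, err := r.Body.Read(buf)
		if n > 0 {
			if _, werr := w.Write(buf[:n]); werr != nil {
				return
			}
			_ = rc.Flush() // push each piece out immediately
		}
		if err != nil {
			return // io.EOF or a read error ends the echo
		}
	}
}

// echoOnce starts a throwaway local server, posts payload to it, and
// returns whatever the handler echoed back.
func echoOnce(payload string) (string, error) {
	srv := httptest.NewServer(http.HandlerFunc(fullDuplexEcho))
	defer srv.Close()

	resp, err := http.Post(srv.URL, "text/plain", strings.NewReader(payload))
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

func main() {
	out, err := echoOnce("ping")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(out)
}
```

Note that the Go documentation warns that some HTTP/1 clients can deadlock if the server responds before the full request body has been sent, so this is worth testing against the actual clients before enabling it broadly.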

francislavoie avatar Aug 21 '23 01:08 francislavoie