nchan_publisher_upstream_request without proxy_pass = crashes

Open sobitcorp opened this issue 4 years ago • 3 comments

Adding an nchan_publisher_upstream_request directive that points to a location without a proxy_pass directive leads to unpredictable behavior in nchan. For example, consider this configuration:

	location = /reply {
		return 304;
	}
	location ~ /nchantest1/(\w+)$ {
		nchan_pubsub;
		nchan_channel_id $1;
		nchan_publisher_upstream_request /reply;
	}

On the client side, I run:

ws = new WebSocket('wss://example.com/nchantest1');
// send only once the connection is open
ws.onopen = () => ws.send("test");

I expect the message to simply be echoed, since nchan should be getting a 304 upstream response. Instead, one of four things happens (which one seems to depend on the length of the sent string):

  • The message is echoed successfully (rare)
  • Websocket disconnects (sometimes with a seemingly random 1xxx error code)
  • Nginx worker process segfaults: [alert] 11246#11246: worker process 11700 exited on signal 11
  • Websocket disconnects and this line is logged to the Nginx error log (verbatim, without a timestamp): "ker process: nchan-1.2.7/src/subscribers/websocket.c:575: websocket_publish_upstream_handler: Assertion 'd->subrequest' failed."

What I'm actually trying to do is pass the message upstream to PHP via fastcgi_pass (i.e. with a fastcgi_pass directive in my /reply location). Doing this leads to the same behavior as above.

The roundabout way of proxy_passing the request to myself works fine. E.g. this works:

	location = /reply {
		proxy_pass http://localhost/reply2;
	}
	location = /reply2 {
		return 304;
	}
	location ~ /nchantest1/(\w+)$ {
		nchan_pubsub;
		nchan_channel_id $1;
		nchan_publisher_upstream_request /reply;
	}

This, however, causes unwanted access logging and probably a performance hit once many messages are being published.
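
One possible way to limit the extra log lines, assuming the roundabout setup is kept, is to turn off access logging in the inner location that receives the looped-back request. This is an untested sketch: access_log is a standard nginx directive, but how it interacts with nchan's upstream request handling is an assumption here, and it does nothing about the cost of the extra HTTP hop.

	location = /reply2 {
		access_log off;  # assumption: suppresses the log line for the looped-back request
		return 304;
	}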

So is what I'm trying to do simply unsupported, leaving the roundabout way of proxy_passing messages back to my own server as my only choice, or am I just doing something wrong?

Thanks for any advice on this.

sobitcorp avatar Oct 24 '20 08:10 sobitcorp

I'm aware of this issue, and there's no good way around it without seriously mucking around with Nginx internals. I'm currently focused on Nchan 2, and because there is an acceptable workaround, I'm not really willing to spend time fixing this.

For now, please use proxy_pass, annoying as it may be. This issue will be addressed in Nchan 2.

slact avatar Oct 24 '20 18:10 slact

Thanks for the info!

Unfortunately, I've managed to get (very intermittent) segfaults using proxy_pass as well (i.e. with the second config scenario in my original post). Unlike the scenario without proxy_pass, which almost always caused unpredictable behavior, here everything works fine about 95% of the time. The other 5% of the time, an nginx worker process segfaults, either when a websocket disconnects after publishing some messages or right after a message is published (and before it is echoed). Again, the behavior seems to depend on the length and timing of the messages, and the pattern changes when nginx is restarted.

I'll do some more testing later and try to find a simple reproducible config scenario. The scenario I was going to post stopped being reproducible (after some nginx restarts) as I was writing this...

For reference, I'm using nchan v1.2.7 from the 2020-03-17 release with nginx/1.19.3.

sobitcorp avatar Oct 25 '20 01:10 sobitcorp

This unfortunately seems to be broken still. Segfault about 10% of the time on websocket disconnect... and then everything is broken.

morebrackets avatar Jan 06 '22 06:01 morebrackets