gpdb

Address already in use (bind errno 98)

Open constzl opened this issue 6 years ago • 8 comments

Greenplum version or build

4.3 & 6X_STABLE

OS version and uname -a

Linux x86_64 GNU/Linux

autoconf options used ( config.status --config )

Installation information ( pg_config )

Expected behavior

Actual behavior

 LOG:  (58M01) Master unable to connect to seg2 xxxx:3176 with options : FATAL:  Interconnect Error: Could not set up tcp listener socket.
DETAIL:  Address already in use (bind errno 98)

Steps to reproduce the behavior

constzl avatar Oct 22 '19 12:10 constzl

I think the function setupTCPListeningSocket should set the SO_REUSEADDR option on the socket before calling bind().
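To make the suggestion concrete, here is a minimal sketch of a TCP listener that sets SO_REUSEADDR before bind(). It is not the Greenplum code: the helper name make_tcp_listener, its parameters, and the backlog of 16 are assumptions made for illustration only.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper (not the Greenplum implementation): create a TCP
 * listener on ip:port, optionally setting SO_REUSEADDR first.
 * Port 0 lets the kernel pick a free port.
 * Returns the listening fd, or -1 on failure.
 */
int
make_tcp_listener(const char *ip, uint16_t port, int reuse_addr)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    if (reuse_addr)
    {
        int one = 1;

        /* The option has to be set before bind() to affect the bind check. */
        if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one)) < 0)
        {
            close(fd);
            return -1;
        }
    }

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);

    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1 ||
        bind(fd, (struct sockaddr *) &addr, sizeof(addr)) < 0 ||
        listen(fd, 16) < 0)
    {
        close(fd);
        return -1;
    }

    return fd;
}
```

A caller would use something like make_tcp_listener("127.0.0.1", 0, 1) and close() the returned descriptor when done. The only point of the sketch is the ordering: setsockopt() comes between socket() and bind().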

image

constzl avatar Oct 22 '19 12:10 constzl

Does that mean port 3176 is already in use? Can we find out who is using it?

remotefish avatar Oct 23 '19 07:10 remotefish

Do we have the same problem in setupUDPListeningSocket()?

By the way, what exactly is the problem here? Is it a transient problem, does the error stop occurring after some time?

asimrp avatar Oct 23 '19 07:10 asimrp

> Do we have the same problem in setupUDPListeningSocket()?
>
> By the way, what exactly is the problem here? Is it a transient problem, does the error stop occurring after some time?

It is an intermittent problem: it occurs only with some probability. See the comment in the function StreamServerPort for more details:

Without the SO_REUSEADDR flag, a new postmaster can't be started
right away after a stop or crash, giving "address already in use"
error on TCP ports.

constzl avatar Oct 23 '19 10:10 constzl

> Does that mean port 3176 is already in use? Can we find out who is using it?

The socket on port 3176 is in the TIME_WAIT state. When a lot of TCP connections are created at the same time, there is some probability that a TCP port number gets reused, and some of those ports may still be in TIME_WAIT. Without the SO_REUSEADDR flag, we then get the "address already in use" error on TCP ports.

constzl avatar Oct 23 '19 10:10 constzl

I see, thank you for the details. UDP connections won't suffer from this issue, setupUDPListeningSocket() doesn't need any fix. Am I right?

asimrp avatar Oct 23 '19 12:10 asimrp

> I see, thank you for the details. UDP connections won't suffer from this issue, setupUDPListeningSocket() doesn't need any fix. Am I right?

yeah

constzl avatar Oct 23 '19 12:10 constzl

> Do we have the same problem in setupUDPListeningSocket()?
>
> By the way, what exactly is the problem here? Is it a transient problem, does the error stop occurring after some time?

Interconnect/UDP could be even worse. All sockets for interconnect/UDP are bound to *:<port>, whereas the sockets for interconnect/TCP are bound to a unicast IP address. That means all interconnect/UDP sockets on a host share the same port space.
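A small sketch of why the bind address matters for the port space, assuming Linux and default socket options. The helper bind_udp is hypothetical and unrelated to the interconnect code. Once a socket is bound to the wildcard address on some port, a second bind to a specific address on the same port fails, while two sockets bound to different specific addresses can share a port number.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: bind a UDP socket to ip:port without any reuse
 * options. Port 0 lets the kernel choose a free port.
 * Returns the fd, or -1 on failure. If bound_port is not NULL, the port
 * actually bound (host byte order) is stored there on success.
 */
int
bind_udp(const char *ip, uint16_t port, uint16_t *bound_port)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);

    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1 ||
        bind(fd, (struct sockaddr *) &addr, sizeof(addr)) < 0 ||
        getsockname(fd, (struct sockaddr *) &addr, &len) < 0)
    {
        close(fd);
        return -1;
    }

    if (bound_port != NULL)
        *bound_port = ntohs(addr.sin_port);

    return fd;
}
```

With this helper, binding "0.0.0.0" on a kernel-chosen port and then "127.0.0.1" on that same port is expected to fail on the second call, whereas binding "127.0.0.1" and then "127.0.0.2" on one port is expected to succeed on Linux, where the whole 127.0.0.0/8 range is local.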

@constzl Could you show me the contents of gp_segment_configuration? For example: select dbid, hostname, address from gp_segment_configuration;

gfphoenix78 avatar Feb 26 '20 02:02 gfphoenix78