PySyncObj

Servers with IP address are not communicating

Open naluka1994-zz opened this issue 4 years ago • 10 comments

Let's say I have server A at test-0.domain.com:5000, server B at test-1.domain.com:5000, and server C at test-2.domain.com:5000.

The addresses test-0.domain.com, test-1.domain.com, and test-2.domain.com match the addresses that ifconfig reports on the respective servers.

On server A

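from pysyncobj import SyncObj
import time

class testserver(SyncObj):
    def __init__(self, currentHost=None, partners=None):
        # Addresses are hard-coded; the currentHost/partners arguments are unused.
        super(testserver, self).__init__('test-0.domain.com:5000', ['test-1.domain.com:5000', 'test-2.domain.com:5000'])

test = testserver()
while True:
    print(test._getLeader())
    time.sleep(1)  # poll once per second instead of busy-looping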

On server B

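from pysyncobj import SyncObj
import time

class testserver(SyncObj):
    def __init__(self, currentHost=None, partners=None):
        # Addresses are hard-coded; the currentHost/partners arguments are unused.
        super(testserver, self).__init__('test-1.domain.com:5000', ['test-0.domain.com:5000', 'test-2.domain.com:5000'])

test = testserver()
while True:
    print(test._getLeader())
    time.sleep(1)  # poll once per second instead of busy-looping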

On server C

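from pysyncobj import SyncObj
import time

class testserver(SyncObj):
    def __init__(self, currentHost=None, partners=None):
        # Addresses are hard-coded; the currentHost/partners arguments are unused.
        super(testserver, self).__init__('test-2.domain.com:5000', ['test-0.domain.com:5000', 'test-1.domain.com:5000'])

test = testserver()
while True:
    print(test._getLeader())
    time.sleep(1)  # poll once per second instead of busy-looping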

I started the above code on the three servers A, B, and C and tried to print the leader, but on every server the leader is None.

@bakwc, or can someone else help me resolve this issue?
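The three scripts above differ only in which address is "self" and which are partners. A minimal sketch of deriving both from one shared node list is shown below; `split_cluster` and `NODES` are hypothetical names introduced for illustration and are not part of PySyncObj.

```python
def split_cluster(nodes, ordinal):
    """Return (self_address, partner_addresses) for the node at index `ordinal`."""
    if not 0 <= ordinal < len(nodes):
        raise IndexError("ordinal %d is outside the node list" % ordinal)
    self_addr = nodes[ordinal]
    # Every other entry in the list is a partner of this node.
    partners = [addr for i, addr in enumerate(nodes) if i != ordinal]
    return self_addr, partners


# Hypothetical shared list, using the addresses from this issue.
NODES = [
    'test-0.domain.com:5000',
    'test-1.domain.com:5000',
    'test-2.domain.com:5000',
]

# On server B (index 1):
self_addr, partners = split_cluster(NODES, 1)
print(self_addr, partners)
# These two values would then be passed to the SyncObj constructor,
# e.g. super().__init__(self_addr, partners), instead of hard-coding them.
```

This keeps the address lists consistent across the servers, which rules out a typo in one of the three copies as the cause.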

naluka1994-zz avatar Jul 21 '21 01:07 naluka1994-zz

What is your network card's IP address? Is it the same as the IP address that test-0.domain.com resolves to? Which OS do your servers run? Is it Linux?

bakwc avatar Jul 21 '21 09:07 bakwc

I uploaded a possible fix. Try the latest version from GitHub:

# 1) Remove current pip version
sudo pip uninstall pysyncobj

# 2) Clone github repo
git clone https://github.com/bakwc/PySyncObj
cd PySyncObj

# 3) Install it
sudo python setup.py install

bakwc avatar Jul 21 '21 14:07 bakwc

@bakwc Yes, the network card IP address is the same as the domain's. The servers run Alpine Linux. Would it be possible to release the fix on PyPI? I am currently testing the given fix and will let you know the result.

naluka1994-zz avatar Jul 21 '21 15:07 naluka1994-zz

Yes, the network card IP address is same as the domain

Then the fix probably won't work. Everything looks fine. Please try using IP addresses instead of hostnames. If that doesn't help, you can give me access to your machines so I can look myself.

bakwc avatar Jul 21 '21 16:07 bakwc

I am using IP addresses instead of domain names. I get the IP address using socket.gethostbyname(domain name). Will the fix work if I pass IP addresses?
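The lookup described here can be sketched as follows. `to_ip_address` is a hypothetical helper, not part of PySyncObj, and it assumes an IPv4 lookup of a 'host:port' string is what is wanted.

```python
import socket


def to_ip_address(node):
    """Turn 'host:port' into 'ip:port' using an IPv4 lookup."""
    host, port = node.rsplit(':', 1)
    # gethostbyname only returns IPv4 addresses.
    return '%s:%s' % (socket.gethostbyname(host), port)


# A literal IPv4 address is returned unchanged, so the helper is safe to
# apply to addresses that are already numeric.
print(to_ip_address('127.0.0.1:5000'))
```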

naluka1994-zz avatar Jul 21 '21 16:07 naluka1994-zz

I am running on Kubernetes using a StatefulSet. It still does not work when I pass IP addresses. The servers can definitely reach each other, as I have verified with telnet and ping.

naluka1994-zz avatar Jul 21 '21 18:07 naluka1994-zz

On server B, I ran telnet to check whether server A is accepting connections:

/opt/service # busybox-extras telnet test-0.test.svc.cluster.local:5000
Connected to test-0.test.svc.cluster.local:5000:5000
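A check like this telnet test can also be scripted. The sketch below uses a hypothetical helper, `can_connect`, and probes a throwaway local listener so that it runs anywhere; on the real cluster the partner addresses would be passed instead. Note that a successful connect only shows that the port accepts TCP connections, not that the nodes complete any exchange on top of it.

```python
import socket


def can_connect(node, timeout=3.0):
    """Return True if a TCP connection to 'host:port' succeeds within `timeout` seconds."""
    host, port = node.rsplit(':', 1)
    try:
        with socket.create_connection((host, int(port)), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


# Self-contained demo: listen on an ephemeral local port, then probe it.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(('127.0.0.1', 0))
server.listen(1)
local_node = '127.0.0.1:%d' % server.getsockname()[1]
print(local_node, can_connect(local_node))
server.close()
```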

On Server A:

/opt/service # netstat -an
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 :::5000                 :::*                    LISTEN
tcp        0      0 :::5001                 :::*                    LISTEN
tcp        0      0 :::5002                 :::*                    LISTEN
tcp        0      0 :::80                   :::*                    LISTEN
tcp        0      0 :::443                  :::*                    LISTEN
tcp        0      0 ::ffff:172.20.3.224:5000 ::ffff:172.20.10.8:37146 ESTABLISHED
udp        0      0 127.0.0.1:38641         127.0.0.1:8125          ESTABLISHED
Active UNIX domain sockets (servers and established)
Proto RefCnt Flags       Type       State         I-Node Path

This is how I am passing current and partner hosts.

CurrentHost Port: 172.20.3.225:5000 PartnerHost Hosts: ['172.20.10.9:5000', '172.20.5.187:5000']

I see that the servers are not communicating through the SyncObj code, and the leader is always None.

@bakwc, can you please look into this issue?

naluka1994-zz avatar Jul 22 '21 00:07 naluka1994-zz

tcp 0 0 ::ffff:172.20.3.224:5000 ::ffff:172.20.10.8:37146 ESTABLISHED

This is an IPv6 connection. Try to use IPv4.
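For context, the ::ffff: prefix in that netstat line marks an IPv4-mapped IPv6 address, which is typically what is shown when an IPv4 client connects to a socket listening on the IPv6 wildcard (:::5000) in dual-stack mode. Whether this affects PySyncObj here is not established in this thread. A small sketch with the standard ipaddress module shows the embedded IPv4 address:

```python
import ipaddress

# Local address taken from the netstat output above.
addr = ipaddress.ip_address('::ffff:172.20.3.224')

print(addr.version)      # 6: the address is represented as IPv6
print(addr.ipv4_mapped)  # 172.20.3.224: the IPv4 address it wraps
```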

bakwc avatar Jul 22 '21 00:07 bakwc

@bakwc

I am using IPv4 only. I tested this by connecting with a plain socket:

import socket
import sys

try:
    # AF_INET forces an IPv4 socket
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    print("Socket successfully created")
except socket.error as err:
    print("socket creation failed with error %s" % err)
    sys.exit(1)

# port the server listens on
port = 5000

# connecting to the server
s.connect(('test-0.test.pulsar.svc.cluster.local', port))

It is able to create the connection successfully.

Here is how I am passing current host and partner host values to syncObj class. CurrentHost Port: 172.20.3.225:5000 PartnerHost Hosts: ['172.20.10.9:5000', '172.20.5.187:5000']

The same setup works on localhost but not with remote IPs, where the leader is reported as None. Can you please look into this at your earliest convenience?

naluka1994-zz avatar Jul 22 '21 00:07 naluka1994-zz

@bakwc I have logged the connection information from the SyncObj connections. Will this be helpful in debugging?

INFO:root:CurrentHost Port: 172.20.10.36:80 PartnerHost Hosts: ['172.20.4.20:80', '172.20.5.199:80']
INFO:root:addrs : [(<AddressFamily.AF_INET: 2>, <SocketKind.SOCK_STREAM: 1>, 6, '', ('172.20.10.36', 0)), (<AddressFamily.AF_INET: 2>, <SocketKind.SOCK_DGRAM: 2>, 17, '', ('172.20.10.36', 0)), (<AddressFamily.AF_INET: 2>, <SocketKind.SOCK_RAW: 3>, 0, '', ('172.20.10.36', 0))]
INFO:root:ips : ['172.20.10.36']
INFO:root:self.__ip : 172.20.10.36
INFO:root:addrs : [(<AddressFamily.AF_INET: 2>, <SocketKind.SOCK_STREAM: 1>, 6, '', ('172.20.4.20', 0)), (<AddressFamily.AF_INET: 2>, <SocketKind.SOCK_DGRAM: 2>, 17, '', ('172.20.4.20', 0)), (<AddressFamily.AF_INET: 2>, <SocketKind.SOCK_RAW: 3>, 0, '', ('172.20.4.20', 0))]
INFO:root:ips : ['172.20.4.20']
INFO:root:self.__ip : 172.20.4.20
INFO:root:addrs : [(<AddressFamily.AF_INET: 2>, <SocketKind.SOCK_STREAM: 1>, 6, '', ('172.20.5.199', 0)), (<AddressFamily.AF_INET: 2>, <SocketKind.SOCK_DGRAM: 2>, 17, '', ('172.20.5.199', 0)), (<AddressFamily.AF_INET: 2>, <SocketKind.SOCK_RAW: 3>, 0, '', ('172.20.5.199', 0))]
INFO:root:ips : ['172.20.5.199']
INFO:root:self.__ip : 172.20.5.199
INFO:root:Preferred Family: AddressFamily.AF_INET
INFO:root: SOCKET IPV4: AddressFamily.AF_INET
INFO:root:self.__hostAddrType ==> AddressFamily.AF_INET
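The addrs entries in this log have the shape of tuples returned by socket.getaddrinfo; that they were produced this way is an assumption. Under that assumption, the address families a host resolves to can be inspected with a sketch like the following, where `address_families` is a hypothetical helper:

```python
import socket


def address_families(host):
    """Return the set of address families that `host` resolves to."""
    infos = socket.getaddrinfo(host, None)
    # Each entry is (family, type, proto, canonname, sockaddr).
    return {family for family, _type, _proto, _canon, _sockaddr in infos}


# A literal IPv4 address resolves to AF_INET only.
print(address_families('127.0.0.1'))
```

Running it against each partner hostname would show whether any of them also resolves to AF_INET6.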

Let me know if you need any other logging information.

naluka1994-zz avatar Jul 22 '21 08:07 naluka1994-zz