How to connect to multiple NSPs (Linux)

Open zeeMonkeez opened this issue 8 years ago • 17 comments

I have a Linux machine with two network cards dedicated to one NSP each. eth0 has its IP set to 192.168.137.1, and I can connect to an NSP using the example code in testcbsdk.cpp. eth2 is set to 192.168.137.17 and connected to another NSP. However, I don't seem to be able to connect with cbSDK. Here is what I've tried:

  • cbSdkOpen(inst, conType) with inst=1. This seems to connect to the NSP on eth0.
  • cbSdkOpen(0, conType); cbSdkOpen(1, conType); The second call returns -8.
  • Give specific IP address:
    cbSdkOpen(0, conType);
    cbSdkConnection con;
    con.szInIP = "192.168.137.17";
    cbSdkOpen(1, conType, con);

The second call returns -30.

If this is documented somewhere, I missed it..

zeeMonkeez avatar Mar 15 '17 22:03 zeeMonkeez

What happens if you specify con for both? Also, what is conType set to? Can you set them both to CBSDKCONNECTION_UDP?

`-8` is `CBSDKRESULT_ERROPENUDP` // Unable to open UDP interface (might happen if default)  

dashesy avatar Mar 16 '17 04:03 dashesy

conType was CBSDKCONNECTION_DEFAULT, but CBSDKCONNECTION_UDP does not make a difference. If I set con for the first instance as well (with con.szInIP = "192.168.137.1";), that request times out as well.

I can see cbPKT_SYSINFO packets setting runlevel 50 being sent (and acknowledged), and the NSP sends its configuration, but cbSDK does not seem to receive these packets. Packets sent from 192.168.137.17 are also routed through the same physical interface as those from 192.168.137.1 (eth0), and thus reach the same NSP.

I suppose there is a reason why on Linux by default the first socket is bound to 192.168.137.255.

BTW, I created a Wireshark dissector for NSP packets, it might be of use for some...

zeeMonkeez avatar Mar 16 '17 05:03 zeeMonkeez

By default it binds to the broadcast address. What does your ifconfig show? Are .1 and .17 on separate netmasks?

You can check whether they bind to the expected ports:

lsof -i :51002
lsof -i :51001

Think of it this way: cbsdk binds to an address; if that address is reachable only through eth0, it effectively binds to eth0.
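To make "binds to an address" concrete, here is a minimal sketch of opening a UDP socket bound to one local IPv4 address on Linux. `open_udp_bound` is a hypothetical helper written for illustration; it is not part of cbsdk and does not claim to reproduce what cbsdk does internally.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not part of cbsdk): open a UDP socket bound to the
// given local IPv4 address and port. Returns the descriptor, or -1 on failure.
int open_udp_bound(const char* local_ip, uint16_t port) {
    sockaddr_in addr;
    std::memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    // Reject strings that are not dotted-quad IPv4 addresses.
    if (inet_pton(AF_INET, local_ip, &addr.sin_addr) != 1) return -1;

    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0) return -1;

    // bind() fixes the local address; the kernel still chooses the outgoing
    // interface from its routing table, not from this address alone.
    if (bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

For example, `open_udp_bound("192.168.137.17", 51002)` would request a socket on the second NIC's address; whether traffic then actually leaves through eth2 depends on routing.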

dashesy avatar Mar 16 '17 05:03 dashesy

Sorry for butting in, especially since Linux networking is well outside my comfort zone, but is there any way to change the static ip of the second NSP?

cboulay avatar Mar 16 '17 05:03 cboulay

@cboulay You can ask Blackrock. Alternatively, the update package used to contain a script/config that let you set a different IP address. This should be treated with care, because afterwards you may not even be able to update it again!

dashesy avatar Mar 16 '17 05:03 dashesy

OK, I assume there is a reason that under Linux it binds to .255, but on Windows it binds to .1. However, if it has to bind to .255 under Linux to work, then I see no way how the kernel could possibly know how to route outgoing packets – it has to pick the default interface.

FWIW, the relevant lines of ip addr are

2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether xx:xx:xx:xx:xx:xx brd ff:ff:ff:ff:ff:ff
    inet 192.168.137.1/24 brd 192.168.137.255 scope global eth0
       valid_lft forever preferred_lft forever
4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether xx:xx:xx:xx:xx:xx brd ff:ff:ff:ff:ff:ff
    inet 192.168.137.17/24 brd 192.168.137.255 scope global eth2
       valid_lft forever preferred_lft forever

Also

> ip route get to 192.168.137.128 from 192.168.137.1
192.168.137.128 from 192.168.137.1 dev eth0
    cache
> ip route get to 192.168.137.128 from 192.168.137.17
192.168.137.128 from 192.168.137.17 dev eth0
    cache
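The same kind of lookup can be done from a program. The sketch below relies on the fact that `connect()` on a UDP socket sends nothing and only makes the kernel pick a route and a local address, which `getsockname()` then reports. `source_addr_for` is a hypothetical name, and unlike `ip route get ... from ...` it does not let you specify the source.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cstring>
#include <string>

// Hypothetical helper: return the local IPv4 address the kernel would use as
// source for datagrams to dst_ip, or "" if the address is invalid or there is
// no route to it.
std::string source_addr_for(const char* dst_ip) {
    sockaddr_in dst;
    std::memset(&dst, 0, sizeof(dst));
    dst.sin_family = AF_INET;
    dst.sin_port = htons(51002);  // irrelevant for routing; port from the lsof commands
    if (inet_pton(AF_INET, dst_ip, &dst.sin_addr) != 1) return "";

    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0) return "";

    std::string result;
    sockaddr_in local;
    socklen_t len = sizeof(local);
    char buf[INET_ADDRSTRLEN];
    // connect() triggers the route lookup; getsockname() reads the outcome.
    if (connect(fd, reinterpret_cast<sockaddr*>(&dst), sizeof(dst)) == 0 &&
        getsockname(fd, reinterpret_cast<sockaddr*>(&local), &len) == 0 &&
        inet_ntop(AF_INET, &local.sin_addr, buf, sizeof(buf)) != nullptr) {
        result = buf;
    }
    close(fd);
    return result;
}
```

With both NICs on 192.168.137.0/24, calling `source_addr_for("192.168.137.128")` would presumably return only one of the two local addresses, which is the ambiguity shown by the `ip route get` output.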

I don't seem to get open files with port 51001, only

COMMAND  PID   USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
tcbsdk  4103 rtuser    8u  IPv4  28686      0t0  UDP rtlinuxpc:51002
tcbsdk  4103 rtuser   11u  IPv4  28049      0t0  UDP rtlinuxpc:51002

but maybe I'm not fast enough.

Not sure what you mean by 'separate name netmasks'.

zeeMonkeez avatar Mar 16 '17 05:03 zeeMonkeez

> Not sure what you mean by 'separate name netmasks'.

They are both /24. Try putting each NIC on a separate netmask:

eth0

192.168.137.4
~~~Netmask: 255.255.225.15~~~
Netmask: 255.255.255.252

eth1
192.168.137.16
Netmask: 255.255.255.240


1. I hope these are valid netmasks :--) Otherwise they should be routed to a bridge (let's try that only if no other method works).
2. As a hack, if nothing else works, try adding a secondary IP address to eth0 and use that IP for `szInIP`. It should also work, because it really does not matter which IP is used, as long as it is bound to `eth0`.
    
    ip address add 192.168.99.37/24 dev eth0

dashesy avatar Mar 16 '17 05:03 dashesy

hm. My understanding of networking is admittedly weak, but would they not have to be on a netmask that includes the NSPs (which sit at .128)? So it'd have to be /24... if it's /30, bcast address would be .3...

I will try to change the NSP's IP and network settings, that seems to be the sane way to go on Linux...
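The netmask arithmetic behind that objection can be checked with a few lines of code. This is only a sketch; `broadcast_of` and `same_subnet` are hypothetical helper names, and the addresses are the ones from this thread.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <cstdint>
#include <string>

// Parse a dotted-quad IPv4 address into a host-order integer.
bool parse_ipv4(const char* s, uint32_t& out) {
    in_addr a;
    if (inet_pton(AF_INET, s, &a) != 1) return false;
    out = ntohl(a.s_addr);
    return true;
}

// Netmask (host byte order) for a prefix length between 0 and 32.
uint32_t mask_for(unsigned prefix) {
    return prefix == 0 ? 0u : 0xFFFFFFFFu << (32 - prefix);
}

// Broadcast address of the subnet that contains ip/prefix, or "" on bad input.
std::string broadcast_of(const char* ip, unsigned prefix) {
    uint32_t addr;
    if (prefix > 32 || !parse_ipv4(ip, addr)) return "";
    in_addr b;
    b.s_addr = htonl(addr | ~mask_for(prefix));  // set all host bits
    char buf[INET_ADDRSTRLEN];
    return inet_ntop(AF_INET, &b, buf, sizeof(buf)) ? buf : "";
}

// True if a and b are in the same subnet for the given prefix length.
bool same_subnet(const char* a, const char* b, unsigned prefix) {
    uint32_t x, y;
    if (prefix > 32 || !parse_ipv4(a, x) || !parse_ipv4(b, y)) return false;
    return (x & mask_for(prefix)) == (y & mask_for(prefix));
}
```

`broadcast_of("192.168.137.1", 30)` gives `192.168.137.3`, and `same_subnet("192.168.137.1", "192.168.137.128", 30)` is false, so with a /30 the NSP at .128 is no longer on the NIC's subnet. With /24, .1, .17 and .128 all share one subnet, which is why the kernel cannot tell the two NICs apart by destination.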

zeeMonkeez avatar Mar 16 '17 13:03 zeeMonkeez

Just a wee update: If I set SO_BINDTODEVICE, I can get sockets to behave, and traffic goes to the right NSP (I think):

    char dev[] = "eth2";
    ret = setsockopt(sockfd, SOL_SOCKET, SO_BINDTODEVICE, dev, sizeof(dev));

Problem with that:

  • need su privileges
  • need to figure out interface
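Both points can be sketched in code. The lookup below uses `getifaddrs()` to find which interface carries a given IPv4 address, and the wrapper reports the `errno` from `setsockopt()` so that a privilege failure is visible. `interface_for_ip` and `bind_to_device` are hypothetical helper names, not cbsdk functions.

```cpp
#include <arpa/inet.h>
#include <ifaddrs.h>
#include <net/if.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <cerrno>
#include <string>

// Return the name of the interface that carries the given IPv4 address,
// or "" if the string is not an IPv4 address or no interface has it.
std::string interface_for_ip(const char* ip) {
    in_addr wanted;
    if (inet_pton(AF_INET, ip, &wanted) != 1) return "";

    ifaddrs* list = nullptr;
    if (getifaddrs(&list) != 0) return "";

    std::string name;
    for (ifaddrs* it = list; it != nullptr; it = it->ifa_next) {
        if (it->ifa_addr == nullptr || it->ifa_addr->sa_family != AF_INET) continue;
        const sockaddr_in* sin = reinterpret_cast<const sockaddr_in*>(it->ifa_addr);
        if (sin->sin_addr.s_addr == wanted.s_addr) {
            name = it->ifa_name;
            break;
        }
    }
    freeifaddrs(list);
    return name;
}

// Bind fd to the named interface. Returns 0 on success, otherwise the errno
// value (EPERM if the process lacks the privilege mentioned above).
int bind_to_device(int fd, const std::string& ifname) {
    if (ifname.empty() || ifname.size() >= IFNAMSIZ) return EINVAL;
    if (setsockopt(fd, SOL_SOCKET, SO_BINDTODEVICE, ifname.c_str(),
                   static_cast<socklen_t>(ifname.size() + 1)) != 0) {
        return errno;
    }
    return 0;
}
```

On the machine described here, `interface_for_ip("192.168.137.17")` should return `eth2`, which could then be passed to `bind_to_device`. This removes the hard-coded interface name but not the privilege requirement.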

zeeMonkeez avatar Mar 16 '17 15:03 zeeMonkeez

I did not use SO_BINDTODEVICE precisely because it requires root.

Here is another thing to try:

iptables -i eth0 -t nat -A PREROUTING -p udp -d 192.168.137.255 -j DNAT --to-destination 192.168.137.17

Then use 192.168.137.17

Basically I am trying to route the broadcast address (on one of the interfaces) to a fixed destination. (I like Windows semantics on broadcast better! On Windows broadcast is received on all, on Linux it is just 255)

dashesy avatar Mar 17 '17 02:03 dashesy

Also here:

> ip route get to 192.168.137.128 from 192.168.137.1
192.168.137.128 from 192.168.137.1 dev eth0
    cache
> ip route get to 192.168.137.128 from 192.168.137.17
192.168.137.128 from 192.168.137.17 dev eth0
    cache

are they both eth0?

dashesy avatar Mar 17 '17 02:03 dashesy

Thanks, your iptables rule put me on the right track. I'll have to test this more extensively, but initial tests show that packets go out to the right interfaces and are also received (this solution was adapted from an answer on ServerFault):

Re-route incoming bcast packets on eth2 to the device IP on eth2:

iptables -i eth2 -t nat -A PREROUTING -p udp -d 192.168.137.255 -j DNAT --to-destination 192.168.137.17

Add rule to route marked packets according to table 3:

ip rule add fwmark 2 table 3

Route all traffic through eth2 in table 3:

ip route add default dev eth2 table 3
ip route flush cache

Mark packets sent to 192.168.137.129 and rewrite their destination to .128:

iptables -t mangle -A OUTPUT -p udp -d 192.168.137.129 -j MARK --set-mark 2
iptables -t nat -A OUTPUT -p udp -d 192.168.137.129 -o eth2 -j DNAT --to-destination 192.168.137.128

Haven't tested this with cbSDK yet, but this should leave the first NSP unaffected, the second one would have to be opened with a cbSdkConnection where con.szInIP = "192.168.137.17"; and con.szOutIP = "192.168.137.129";. About to test... Also, this clearly generalizes for additional NSPs.

Is there a wiki to collect such bits of info?

zeeMonkeez avatar Mar 17 '17 14:03 zeeMonkeez

Thanks for the update! Yes, there is a wiki; I will copy the final solution there. Also, I do not think you need to re-route .128 as well: it is already bound to the right interface, and that interface will be used when sending to .128.

dashesy avatar Mar 17 '17 15:03 dashesy

Not quite there yet … The connection tracker interferes (I think) … At the moment, a little bare-bones test program that opens sockets for the two NSPs, sends a few cbPKTTYPE_SYSSETRUNLEV packets to both, and receives the responses works, but only if I talk to one NSP at a time. When I talk to both, the connection tracker apparently prevents the rule that redirects incoming packets from .255 on eth2 to .17 from applying. Looking for a solution.

Also, you are right, I don't need to reroute .129 to .128 as long as I add the source IP for the 'mark' rule:

iptables -t mangle -A OUTPUT -p udp -s 192.168.137.17 -d 192.168.137.128 -j MARK --set-mark 2

zeeMonkeez avatar Mar 19 '17 02:03 zeeMonkeez

If I were you I would also look at creating a bridge for one of the NSPs, and give it a different IP address range.

dashesy avatar Mar 19 '17 03:03 dashesy

OK, I think I have a workable solution. It does involve a bit of networking foo, and I cannot claim to completely understand how the individual bits interact (especially tc, whose documentation is ... dense).

Essentially, what we have to do is the following:

  • assign a different IP (192.168.138.1) to second NIC (eth2)
  • ensure outgoing packets have their destination IP rewritten, to 192.168.137.128, but still get routed through eth2
  • NAT incoming (broadcast) packets on eth2, rewriting destination 192.168.137.255 to 192.168.138.255. For that, we cannot use iptables, because netfilter tracks connections and would falsely attribute this traffic to eth0 and skip the NAT (hence tc).

In particular, my /etc/network/interfaces contains these lines:

  up ip route add default dev eth2 table 0x12
  up ip rule add fwmark 0x11 table 0x12 pref 3
  up tc qdisc add dev eth2 ingress || true
  up tc filter add dev eth2 parent ffff: protocol ip prio 1 u32 match ip dst 192.168.137.255 action nat ingress 192.168.137.255 192.168.138.255

  down ip rule del pref 3 || true
  down ip route flush table 0x12 || true
  down tc filter del dev  eth2 parent ffff: protocol ip prio 1 u32 match ip dst 192.168.137.255 action nat ingress 192.168.137.255 192.168.138.255 || true
  down tc qdisc del dev eth2 ingress || true

The first two lines add a new routing table, where all traffic defaults to interface eth2 and all packets marked with 0x11 by the firewall get routed by that table. The next two lines set up tc to do stateless NAT for incoming packets for 192.168.137.255 on eth2 to be redirected to 192.168.138.255. The rest is just for cleaning up these rules when eth2 goes down.

Next, iptables, to deal with those outgoing packets: I made a file /etc/iptables/iptables.up.rules which performs

iptables -t mangle -A OUTPUT -s 192.168.138.1/32 -d 192.168.138.128/32 -p udp -j MARK --set-xmark 0x11/0xffffffff
iptables -t nat -A OUTPUT -s 192.168.138.1/32 -d 192.168.138.128/32 -p udp -j DNAT --to-destination 192.168.137.128

i.e. packets to the second NSP get marked 0x11 and the destination IP gets rewritten to 192.168.137.128.
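To keep the two directions straight, the combined effect of the rules can be written down as a toy model. This is only an illustration of the intended rewrites using the addresses and mark value from this post; it ignores the exact order of the netfilter hooks and does not talk to the kernel. `Packet`, `model_egress` and `model_ingress` are hypothetical names.

```cpp
#include <cstdint>
#include <string>

struct Packet {
    std::string src;
    std::string dst;
    std::string dev;    // interface the packet leaves or arrives on
    uint32_t mark = 0;  // firewall mark (fwmark)
};

// Outgoing side: the mangle rule marks the packet, the nat rule rewrites the
// destination, and "fwmark 0x11 -> table 0x12 -> default dev eth2" routes it.
Packet model_egress(Packet p) {
    if (p.src == "192.168.138.1" && p.dst == "192.168.138.128") {
        p.mark = 0x11;
        p.dst = "192.168.137.128";
    }
    if (p.mark == 0x11) p.dev = "eth2";  // unmarked packets keep their route
    return p;
}

// Incoming side: the tc ingress filter on eth2 does stateless NAT of the
// broadcast destination; packets arriving on other interfaces are untouched.
Packet model_ingress(Packet p) {
    if (p.dev == "eth2" && p.dst == "192.168.137.255") {
        p.dst = "192.168.138.255";
    }
    return p;
}
```

In this model a packet from 192.168.138.1 to 192.168.138.128 leaves on eth2 with destination 192.168.137.128, and a broadcast arriving on eth2 is delivered to 192.168.138.255, while the same broadcast arriving on eth0 keeps 192.168.137.255.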

Finally, I created a file at /etc/network/if-up.d/iptables:

#!/bin/sh
/sbin/iptables-restore < /etc/iptables/iptables.up.rules

(make executable!) to load the iptables rules when an interface goes up (this should be refined to only run when eth2 goes up, and an equivalent should be written that deletes these rules when eth2 goes down ...).

Sources: iptables question on ServerFault, stateless NAT with tc on unix.se.

Now, let's test the setup with code similar to

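cbSdkConnectionType conType = CBSDKCONNECTION_UDP;
cbSdkConnection con;
// Instance 0: first NSP, default addresses (eth0)
cbSdkResult res = cbSdkOpen(0, conType);
printf("%i\n", (int)res);
// Instance 1: second NSP, using the addresses that the tc/iptables rules translate (eth2)
con.szInIP =  "192.168.138.255";
con.szOutIP =  "192.168.138.128";
res = cbSdkOpen(1, conType, con);
printf("%i\n", (int)res);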

This should print 0 twice. It is crucial to monitor traffic with wireshark or a similar tool to make sure traffic goes through the correct NICs. (Although I think the return value of at least one of the calls will be <0 if traffic goes only to one NSP).

Needless to say, it'd be great if someone with two NSPs and a Linux box could test & confirm this ...

zeeMonkeez avatar Mar 20 '17 22:03 zeeMonkeez

@dashesy I'm not quite sure I understand the bridge setup in this case. Which two networks would be bridged? Just trying to understand, as networking is something I have tried to stay clear of ...

zeeMonkeez avatar Mar 21 '17 02:03 zeeMonkeez