
igmpproxy is broken if defaultdown is not set

Open cbean opened this issue 9 years ago • 41 comments

See comment below.

cbean avatar May 24 '16 19:05 cbean

Which OS are you using?

ViToni avatar May 24 '16 20:05 ViToni

I am using gentoo-linux.

cbean avatar May 24 '16 20:05 cbean

Could you please post your complete igmpproxy.conf?

ViToni avatar May 25 '16 12:05 ViToni

Attached is the igmpproxy.conf igmpproxy.conf.txt

cbean avatar May 25 '16 12:05 cbean

I guess the reason you see this is the defaultdown option you are using.

As I understand the code, it means that all interfaces found are configured as downstream by default.

You could try to:

  1. not use this option (what's the reason you used it in the first place?)
  2. disable the interfaces you don't need/want manually via: phyint <your_device_name_here> disabled (as in https://github.com/pali/igmpproxy/blob/next/igmpproxy.conf#L44)

(BTW, is the config for Telekom Entertain? VLAN8 & altnet addresses seem to match)

ViToni avatar May 25 '16 15:05 ViToni

If I remove or do not set defaultdown (i.e. it is disabled), igmpproxy stops working entirely and I get this:

igmpproxy-a599e516fccf617e9be0a0eb5daed4ebec8f5a50-defaultdown-notset.txt

If I enable defaultdown, it is working and I get this:

igmpproxy-a599e516fccf617e9be0a0eb5daed4ebec8f5a50-defaultdown-enabled.txt

I guess I could write a patch to revert the option and check whether it works. What do you think? I will test this, since defaultdown really adds all interfaces.

Reply-to-BTW: Yes, this is German Telekom Entertain TV. I have been using it for almost 7 or 8 years now with Gentoo and igmpproxy, and I appreciate the ongoing development of igmpproxy. THANKS!!!

cbean avatar May 25 '16 16:05 cbean

Would you be so kind as to add the logs as files? The line breaks are broken, which makes them difficult to read.

Please also try to use phyint <your_device_name_here> disabled, especially for pppoe.

ViToni avatar May 25 '16 17:05 ViToni

Okay, will give you feedback soon.

While reverting the patch I didn't find in igmpproxy.h the following:

int getMcGroupSock(void);

Isn't it applied?

cbean avatar May 25 '16 17:05 cbean

You could use the search function, it is here: https://github.com/pali/igmpproxy/blob/next/src/igmpproxy.h#L268 https://github.com/pali/igmpproxy/blob/next/src/rttable.c#L80

Which version are you using?

ViToni avatar May 25 '16 18:05 ViToni

I used defaultdown only because igmpproxy isn't working at all without that option.

I use pending. It doesn't make a difference whether you use pending or next; it only works with the defaultdown option in igmpproxy.conf. Otherwise it is stuck. (Btw, pending doesn't have GroupSock anymore.)

Also, I created a patch to revert the on-the-fly detection of the downstream interface, and it works as expected. So the reason it is not working without defaultdown is commit 86c3637bfe94bbbd00b1b61fb7c29c125ac69d94. It seems there is a bug if you don't use defaultdown.

161_minor_spelling_fixes.patch.txt
162_dont_detach.patch.txt
200_remove_duplicate_code_in_ifvc.c.patch.txt
400_revert_downstream_link_change_detect_on_the_fly.patch.txt

The patches apply to commit a599e516fccf617e9be0a0eb5daed4ebec8f5a50

cbean avatar May 25 '16 20:05 cbean

Here are some logs, all with commit a599e516fccf617e9be0a0eb5daed4ebec8f5a50 and my previous patches 161, 162 and 200 (these are easy changes, so not responsible for bugs):

igmpproxy-a599e516fccf617e9be0a0eb5daed4ebec8f5a50-defaultdown-enabled.txt

igmpproxy-a599e516fccf617e9be0a0eb5daed4ebec8f5a50-defaultdown-notset.txt

And here is the log with additional patch 400 from above (means detect downstream on the fly reverted - commit 86c3637bfe94bbbd00b1b61fb7c29c125ac69d94):

igmpproxy-a599e516fccf617e9be0a0eb5daed4ebec8f5a50-with-default-downstream-detect-on-the-fly-reverted.txt

Remark: While using patch 400 which reverts detection on the fly, messages in the firewall don't appear anymore.

For the record: commit 86c3637bfe94bbbd00b1b61fb7c29c125ac69d94 breaks functionality if you do not set defaultdown.

Please, let me know if you need any further testing. You're welcome.

cbean avatar May 25 '16 20:05 cbean

While using defaultdown, phyint xxx disabled isn't working.

BTW: Using phyint xxx disabled isn't an option, since I also have L2TP PPP clients which may or may not be connected.

cbean avatar May 25 '16 20:05 cbean

Could you be clearer about the expected behaviour? The title says "igmpproxy is sending igmp Proto 2 messages to all interfaces". This is what defaultdown is supposed to do. What problem are you encountering exactly, and what do you expect to happen instead?

IMO defaultdown and phyint xxx disabled should work together; if they don't, this is an issue of its own. There are some devices you really don't want to send multicast to, e.g. pppoe (as in igmpproxy-a599e516fccf617e9be0a0eb5daed4ebec8f5a50-defaultdown-enabled.txt).

ViToni avatar May 26 '16 12:05 ViToni

igmpproxy should work without defaultdown

Bug / Problem:

If defaultdown isn't set, igmpproxy isn't working at all.

Expected behaviour:

If defaultdown isn't set, igmpproxy should treat every interface as disabled by default and add only the interface(s) explicitly configured as downstream in igmpproxy.conf.

Why the option defaultdown in combination with phyint xxx disabled isn't an option in some circumstances:

If you use defaultdown, multicast is sent to all interfaces by default, including pppl2tp interfaces, which mostly connect via WLAN and don't have the bandwidth to carry heavy multicast traffic like IPTV. Disabling the pppl2tp interfaces individually isn't practical, since you would have to know the number N in the pppN interface names, which is chosen randomly by the ppp daemon. So there is a need to keep the default state disabled and add downstream interfaces manually, which is the expected behaviour when defaultdown is unset.

To Do:

Make igmpproxy work with default state disabled for all existing and added interfaces if defaultdown isn't set in igmpproxy.conf.
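The requested semantics can be sketched as a small Python decision function. This is only an illustration of the behaviour asked for in this comment, not igmpproxy's actual code; the function name `resolve_state` and the dictionary layout are hypothetical, and the rule that an explicit phyint line wins over defaultdown is an assumption about how the two options should interact.

```python
def resolve_state(name, configured, defaultdown=False):
    """Return the role an interface should get.

    configured:  mapping of interface name -> "upstream", "downstream" or
                 "disabled", as taken from phyint lines (hypothetical layout).
    defaultdown: True if the defaultdown option is present in the config.
    """
    if name in configured:
        # Assumption: an explicit phyint line always wins, including "disabled".
        return configured[name]
    # Interfaces not listed in the config: downstream only with defaultdown.
    return "downstream" if defaultdown else "disabled"


configured = {"wan0.8": "upstream", "br20": "downstream"}
for ifname in ("wan0.8", "br20", "ppp3", "pppoe"):
    print(ifname, resolve_state(ifname, configured))
```

With defaultdown unset, a dynamically named interface such as ppp3 stays disabled without having to be listed by name, which is the point of the request.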

(Sorry for unclear statements before.)

cbean avatar May 26 '16 12:05 cbean

Could you please test, if this works for you: https://github.com/ViToni/igmpproxy/tree/getifaddrs

ViToni avatar Jun 01 '16 18:06 ViToni

In response to your version: segfault - see attached logs.

igmpproxy_&&_kern.log_concated.txt

cbean avatar Jun 02 '16 09:06 cbean

Could you give this one a try: https://github.com/ViToni/igmpproxy/tree/issue_12 with ./igmpproxy -d -vvvvv /tmp/igmpproxy.conf

ViToni avatar Jun 12 '16 09:06 ViToni

You are always welcome. See the result:

igmpproxylog.txt

Jun 12 17:16:35 pluto kernel: [90341.627869] igmpproxy[14654]: segfault at 4 ip 0000006af0c933f6 sp 000003a8c3ece3f0 error 4 in igmpproxy[6af0c8b000+c000]

cbean avatar Jun 12 '16 15:06 cbean

Do you have devices without an assigned IP? I suspect the logging does not handle this case correctly. I updated the tree: https://github.com/ViToni/igmpproxy/tree/issue_12

ViToni avatar Jun 12 '16 21:06 ViToni

Yes, I have devices without an assigned IP.

And good news: it works!

/usr/sbin/igmpproxy -d -vvvvv /etc/igmpproxy.conf

adding VIF, Ix 0 Fl 0x0 IP 0x011410ac br20, Threshold: 1, Ratelimit: 0
adding VIF, Ix 1 Fl 0x0 IP 0xdc63e80a wan0.8, Threshold: 1, Ratelimit: 0
joinMcGroup: 224.0.0.2 on br20
joinMcGroup: 224.0.0.22 on br20
RECV Membership query from 172.16.20.1 to 224.0.0.1
RECV V2 member report from 172.16.20.1 to 224.0.0.22
The IGMP message was from myself. Ignoring.
RECV V2 member report from 172.16.20.1 to 224.0.0.2
The IGMP message was from myself. Ignoring.
The IGMP message was local multicast. Ignoring.
RECV Membership query from 10.232.127.254 to 224.0.0.1
RECV V2 member report from 172.16.20.77 to 239.35.10.1
Inserted route table entry for 239.35.10.1 on VIF #0
joinMcGroup: 239.35.10.1 on wan0.8
RECV V2 member report from 172.16.20.77 to 239.35.100.9
Inserted route table entry for 239.35.100.9 on VIF #0
joinMcGroup: 239.35.100.9 on wan0.8
The IGMP message was local multicast. Ignoring.
Adding MFC: 193.158.35.251 -> 239.35.10.1, InpVIf: 1
The IGMP message was local multicast. Ignoring.
RECV Membership query from 172.16.1.254 to 224.0.0.1
RECV Membership query from 172.16.1.254 to 224.0.0.1
RECV Membership query from 172.16.17.254 to 224.0.0.1
RECV Membership query from 172.16.17.254 to 224.0.0.1
RECV Membership query from 172.16.19.254 to 224.0.0.1
RECV Membership query from 172.16.19.254 to 224.0.0.1
RECV Membership query from 172.16.21.254 to 224.0.0.1
RECV Membership query from 172.16.21.254 to 224.0.0.1
RECV Membership query from 172.16.22.254 to 224.0.0.1
RECV Membership query from 172.16.22.254 to 224.0.0.1
The IGMP message was local multicast. Ignoring.
RECV Membership query from 172.16.20.1 to 224.0.0.1
The IGMP message was local multicast. Ignoring.
RECV V2 member report from 172.16.20.1 to 224.0.0.2
The IGMP message was from myself. Ignoring.
RECV V2 member report from 172.16.20.77 to 239.35.10.1
Updated route entry for 239.35.10.1 on VIF #0
Adding MFC: 193.158.35.251 -> 239.35.10.1, InpVIf: 1
RECV V2 member report from 172.16.20.77 to 239.35.100.9
Updated route entry for 239.35.100.9 on VIF #0
The IGMP message was local multicast. Ignoring.
RECV V2 member report from 172.16.20.1 to 224.0.0.22
The IGMP message was from myself. Ignoring.
Adding MFC: 193.158.35.160 -> 239.35.100.9, InpVIf: 1
The IGMP message was local multicast. Ignoring.

cbean avatar Jun 12 '16 22:06 cbean

Okay, there are still issues after dhclient sends DHCP requests on the upstream interface:

Jun 13 02:57:45 pluto dhclient: DHCPREQUEST on wan0.8 to 193.158.137.14 port 67
Jun 13 02:57:45 pluto dhclient: DHCPACK from 193.158.137.14
Jun 13 02:57:46 pluto igmpproxy[29439]: select() failure; Errno(4): Interrupted system call
Jun 13 02:57:46 pluto igmpproxy[5889]: buildIfVc: Too many interfaces, skipping br21
Jun 13 02:57:46 pluto igmpproxy[5889]: buildIfVc: Too many interfaces, skipping br22
Jun 13 02:57:46 pluto igmpproxy[5889]: buildIfVc: Too many interfaces, skipping wan0.8
Jun 13 02:57:46 pluto igmpproxy[5889]: buildIfVc: Too many interfaces, skipping pppoe
Jun 13 02:57:46 pluto igmpproxy[5889]: buildIfVc: Too many interfaces, skipping lo
Jun 13 02:57:46 pluto igmpproxy[5889]: buildIfVc: Too many interfaces, skipping lan0
Jun 13 02:57:46 pluto igmpproxy[5889]: buildIfVc: Too many interfaces, skipping wan0
Jun 13 02:57:46 pluto igmpproxy[5889]: buildIfVc: Too many interfaces, skipping vlan1
Jun 13 02:57:46 pluto igmpproxy[5889]: buildIfVc: Too many interfaces, skipping vlan17
Jun 13 02:57:46 pluto igmpproxy[5889]: buildIfVc: Too many interfaces, skipping vlan18
Jun 13 02:57:46 pluto /etc/init.d/igmpproxy[5888]: start-stop-daemon: caught an interrupt
Jun 13 02:57:46 pluto /etc/init.d/igmpproxy[5888]: start-stop-daemon: /usr/sbin/igmpproxy died
Jun 13 02:57:46 pluto igmpproxy[5889]: buildIfVc: Too many interfaces, skipping vlan19
Jun 13 02:57:46 pluto igmpproxy[5889]: buildIfVc: Too many interfaces, skipping vlan20
Jun 13 02:57:46 pluto igmpproxy[5889]: buildIfVc: Too many interfaces, skipping vlan21
Jun 13 02:57:46 pluto igmpproxy[5889]: buildIfVc: Too many interfaces, skipping vlan22
Jun 13 02:57:46 pluto igmpproxy[5889]: buildIfVc: Too many interfaces, skipping br1
Jun 13 02:57:46 pluto igmpproxy[5889]: buildIfVc: Too many interfaces, skipping br17
Jun 13 02:57:46 pluto igmpproxy[5889]: buildIfVc: Too many interfaces, skipping br18
Jun 13 02:57:46 pluto igmpproxy[5889]: buildIfVc: Too many interfaces, skipping br19
Jun 13 02:57:46 pluto igmpproxy[5889]: buildIfVc: Too many interfaces, skipping br20
Jun 13 02:57:46 pluto igmpproxy[5889]: buildIfVc: Too many interfaces, skipping br21
Jun 13 02:57:46 pluto igmpproxy[5889]: buildIfVc: Too many interfaces, skipping br22
Jun 13 02:57:46 pluto igmpproxy[5889]: buildIfVc: Too many interfaces, skipping wan0.7
Jun 13 02:57:46 pluto igmpproxy[5889]: buildIfVc: Too many interfaces, skipping wan0.8
Jun 13 02:57:46 pluto igmpproxy[5889]: buildIfVc: Too many interfaces, skipping wifi0
Jun 13 02:57:46 pluto igmpproxy[5889]: There must be at least 2 Vif's where one is upstream.
Jun 13 02:57:46 pluto /etc/init.d/igmpproxy[5852]: ERROR: igmpproxy failed to start
Jun 13 02:57:46 pluto dhclient: bound to 10.232.99.220 -- renewal in 42988 seconds.
Jun 13 02:57:52 pluto hostapd: wifi0: STA 34:e2:fd:59:a6:94 WPA: group key handshake completed (RSN)
Jun 13 03:00:01 pluto igmpproxy[6115]: buildIfVc: Too many interfaces, skipping br21
Jun 13 03:00:01 pluto igmpproxy[6115]: buildIfVc: Too many interfaces, skipping br22
Jun 13 03:00:01 pluto igmpproxy[6115]: buildIfVc: Too many interfaces, skipping wan0.8
Jun 13 03:00:01 pluto igmpproxy[6115]: buildIfVc: Too many interfaces, skipping pppoe
Jun 13 03:00:01 pluto igmpproxy[6115]: buildIfVc: Too many interfaces, skipping lo
Jun 13 03:00:01 pluto igmpproxy[6115]: buildIfVc: Too many interfaces, skipping lan0
Jun 13 03:00:01 pluto igmpproxy[6115]: buildIfVc: Too many interfaces, skipping wan0
Jun 13 03:00:01 pluto igmpproxy[6115]: buildIfVc: Too many interfaces, skipping vlan1
Jun 13 03:00:01 pluto igmpproxy[6115]: buildIfVc: Too many interfaces, skipping vlan17
Jun 13 03:00:01 pluto igmpproxy[6115]: buildIfVc: Too many interfaces, skipping vlan18
Jun 13 03:00:01 pluto igmpproxy[6115]: buildIfVc: Too many interfaces, skipping vlan19
Jun 13 03:00:01 pluto igmpproxy[6115]: buildIfVc: Too many interfaces, skipping vlan20
Jun 13 03:00:01 pluto igmpproxy[6115]: buildIfVc: Too many interfaces, skipping vlan21
Jun 13 03:00:01 pluto igmpproxy[6115]: buildIfVc: Too many interfaces, skipping vlan22
Jun 13 03:00:01 pluto igmpproxy[6115]: buildIfVc: Too many interfaces, skipping br1
Jun 13 03:00:01 pluto igmpproxy[6115]: buildIfVc: Too many interfaces, skipping br17
Jun 13 03:00:01 pluto igmpproxy[6115]: buildIfVc: Too many interfaces, skipping br18
Jun 13 03:00:01 pluto igmpproxy[6115]: buildIfVc: Too many interfaces, skipping br19
Jun 13 03:00:01 pluto igmpproxy[6115]: buildIfVc: Too many interfaces, skipping br20
Jun 13 03:00:01 pluto igmpproxy[6115]: buildIfVc: Too many interfaces, skipping br21
Jun 13 03:00:01 pluto igmpproxy[6115]: buildIfVc: Too many interfaces, skipping br22
Jun 13 03:00:01 pluto igmpproxy[6115]: buildIfVc: Too many interfaces, skipping wan0.7
Jun 13 03:00:01 pluto igmpproxy[6115]: buildIfVc: Too many interfaces, skipping wan0.8
Jun 13 03:00:01 pluto igmpproxy[6115]: buildIfVc: Too many interfaces, skipping wifi0
Jun 13 03:00:01 pluto igmpproxy[6115]: There must be at least 2 Vif's where one is upstream.
Jun 13 03:00:01 pluto /etc/init.d/igmpproxy[6114]: start-stop-daemon: /usr/sbin/igmpproxy died
Jun 13 03:00:01 pluto /etc/init.d/igmpproxy[6114]: start-stop-daemon: caught an interrupt
Jun 13 03:00:01 pluto /etc/init.d/igmpproxy[6062]: ERROR: igmpproxy failed to start
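A log like this is easier to judge when the skip messages are counted per process and interface. The following Python sketch does that with a regular expression; the function name `count_skips` is hypothetical, and the sample contains only a few lines taken from the log above. It matches on whitespace rather than line breaks, so it should also cope with logs whose newlines were lost.

```python
import re
from collections import Counter

# Matches the "Too many interfaces" message and captures PID and interface.
SKIP_RE = re.compile(
    r"igmpproxy\[(?P<pid>\d+)\]:\s+buildIfVc: Too many interfaces, "
    r"skipping (?P<ifname>\S+)"
)


def count_skips(log_text):
    """Count how often each (pid, interface) pair was reported as skipped."""
    counts = Counter()
    for match in SKIP_RE.finditer(log_text):
        counts[(match.group("pid"), match.group("ifname"))] += 1
    return counts


sample = """\
Jun 13 02:57:46 pluto igmpproxy[5889]: buildIfVc: Too many interfaces, skipping br21
Jun 13 02:57:46 pluto igmpproxy[5889]: buildIfVc: Too many interfaces, skipping br22
Jun 13 02:57:46 pluto igmpproxy[5889]: buildIfVc: Too many interfaces, skipping wan0.8
Jun 13 02:57:46 pluto igmpproxy[5889]: buildIfVc: Too many interfaces, skipping br21
"""
counts = count_skips(sample)
repeated = sorted(name for (pid, name), n in counts.items() if n > 1)
print(repeated)  # interfaces skipped more than once by the same process
```

An interface that one process skips more than once suggests that the interface list was walked with duplicates or rebuilt repeatedly, which is worth mentioning when reporting the problem.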

cbean avatar Jun 13 '16 01:06 cbean

Do you need the dynamic reload mechanism? The original patch for this functionality included multiple changes which seem to introduce some phenomena...

ViToni avatar Jun 13 '16 08:06 ViToni

Is the dynamic reload mechanism broken? YES
Is igmpproxy working flawlessly without commit 86c3637bfe94bbbd00b1b61fb7c29c125ac69d94? YES (see patch 400 in previous comment.)
Is there a possibility to disable dynamic reloading? NO
Do I need the dynamic reload mechanism? NO

I would fix the issue myself if I could. I will help testing code, if required.

My guess about the above issue: igmpproxy drops interfaces once the maximum number of interfaces is reached, in this case the first detected interfaces, which are the upstream and downstream interfaces set in the config. It seems igmpproxy does not forget old, unused interfaces and so reaches the maximum number of interfaces, which is set in:

igmpproxy-issue_12/src/igmpproxy.h:#define MAX_IF 40 // max. num

So igmpproxy probably also needs to probe for unused interfaces and remove them from the list. Of course, it should not remove the interfaces set in the config. (Sorry for my English, I am not a native speaker.)
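The guess can be illustrated with a toy model in Python. It is not igmpproxy's code and does not confirm what igmpproxy does internally; the class `IfTable`, its method names, and the 21 placeholder interface names are all hypothetical. Only the capacity of 40 is taken from the MAX_IF definition quoted above.

```python
MAX_IF = 40  # value quoted from igmpproxy.h above


class IfTable:
    """Toy fixed-capacity interface table (hypothetical model)."""

    def __init__(self, capacity=MAX_IF):
        self.capacity = capacity
        self.entries = []

    def rebuild(self, system_ifaces, clear_first):
        """Re-scan the interfaces and return the names that did not fit."""
        if clear_first:
            self.entries = []
        skipped = []
        for name in system_ifaces:
            if len(self.entries) >= self.capacity:
                skipped.append(name)
            else:
                self.entries.append(name)
        return skipped


ifaces = [f"if{i}" for i in range(21)]  # placeholder interface names

leaky = IfTable()
for scan in range(3):
    n = len(leaky.rebuild(ifaces, clear_first=False))
    print("without clearing, scan", scan, "skipped", n)

clean = IfTable()
for scan in range(3):
    n = len(clean.rebuild(ifaces, clear_first=True))
    print("with clearing, scan", scan, "skipped", n)
```

In this model a table that is never cleared fills up after the second scan and then rejects every interface, including the configured ones, while a table that is cleared before each scan never overflows with 21 interfaces.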

cbean avatar Jun 13 '16 19:06 cbean

Please try this branch: https://github.com/ViToni/igmpproxy/tree/logging

I reworked buildIfvc()/rebuildIfVC() into something hopefully more useful for many interfaces.

The limitation is mostly due to the specific OS, see here: http://lxr.free-electrons.com/source/include/uapi/linux/mroute.h#L35 Only 32 VIFs are supported.
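Given the 32 VIF limit, only interfaces that actually take part in forwarding should consume a VIF slot. The Python sketch below shows one way to express that; `assign_vifs` and the `states` mapping are hypothetical names for illustration and are not taken from igmpproxy. The constant mirrors MAXVIFS from the linked Linux header.

```python
MAXVIFS = 32  # limit defined in the Linux header linked above


def assign_vifs(states, max_vifs=MAXVIFS):
    """Give a VIF index to every interface that is not disabled.

    states: mapping of interface name -> "upstream", "downstream" or "disabled"
    Raises ValueError if more interfaces need a VIF than the kernel offers.
    """
    active = [name for name, role in states.items() if role != "disabled"]
    if len(active) > max_vifs:
        raise ValueError(
            f"{len(active)} interfaces need a VIF, limit is {max_vifs}"
        )
    return {name: index for index, name in enumerate(active)}


states = {
    "br20": "downstream",
    "wan0.8": "upstream",
    "pppoe": "disabled",
    "lo": "disabled",
}
print(assign_vifs(states))
```

With the two configured interfaces from this thread, the result matches the "adding VIF, Ix 0 ... br20" and "Ix 1 ... wan0.8" lines in the earlier debug output, and disabled interfaces no longer count against the limit.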

ViToni avatar Jun 13 '16 22:06 ViToni

To explain my network setup:

The machine has physical interfaces:

wan0, lan0 (lan1, lan2, lan3 and lan4 are unused and don't have an IP set). wifi0 is the wireless interface, running hostapd and added to bridge br17.

wan0 is configured without an IP and carries VLAN IDs 7 and 8: wan0.7 (PPPoE) and wan0.8 (IPTV).

lan0 carries VLAN IDs 1, 17, 18, 19, 20, 21 and 22 without IPs. On each VLAN there is a bridge with an IP, i.e. br1 == 172.16.1.1/24, br17 == 172.16.17.1/24, br18 == 172.16.18.1/24 and so on.

The IP .254 is a Cisco Layer 2 switch which does IGMP snooping.

See ifconfig.

cbean avatar Jun 14 '16 00:06 cbean

Wow, you changed a lot of code and it is logging like a pasting machine 👍

So far, it has been running for almost 1 hour... I still can't say if it is fixed. The previous version ran for 3 hrs before the issue came up. Let's wait and see...

cbean avatar Jun 14 '16 01:06 cbean

Fix confirmed! Still working without any issue. Good job. 👍 /proc/net/ip_mr_cache/~vif

cbean avatar Jun 14 '16 17:06 cbean

Thanks!

There was still some code which had to be changed for the dynamic rebuild of VIFs/IFs.

Could you please test again to be sure there are no regressions: https://github.com/ViToni/igmpproxy/tree/logging

ViToni avatar Jun 15 '16 22:06 ViToni

Working. commit 7d34b6b25183ce363f15b1de4af45dd90c7326d6

igmpproxylog.txt

`cat /proc/net/ip_mr_cache /proc/net/ip_mr_vif

Group Origin Iif Pkts Bytes Wrong Oifs
096423EF 3E239EC1 1 2 520 0 0:1
096423EF BE239EC1 1 6 1560 0 0:1
096423EF A0239EC1 1 4 1040 0 0:1
096423EF F1239EC1 1 4 1040 0 0:1
1B2823EF FB239EC1 1 160709 217732760 0 0:1
FAFFFFEF 4D1410AC -1 0 0 0

Interface BytesIn PktsIn BytesOut PktsOut Flags Local Remote
0 br20 0 0 217736920 160725 00000 011410AC 00000000
1 wan0.8 217736920 160725 0 0 00000 DC63E80A 00000000`
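The hexadecimal addresses in this output are hard to read at a glance. The Python sketch below decodes them into dotted notation; it assumes a little-endian machine, which fits this thread because 011410AC then decodes to 172.16.20.1 (br20) and DC63E80A to 10.232.99.220 (the address dhclient reported for wan0.8). The function names `hex_to_ip` and `parse_mr_cache` are hypothetical helpers, and the sample holds three rows copied from the output above.

```python
import socket
import struct


def hex_to_ip(hex_word):
    """Decode an address as printed by the kernel on a little-endian host."""
    return socket.inet_ntoa(struct.pack("<I", int(hex_word, 16)))


def parse_mr_cache(text):
    """Return (group, origin, input VIF) for each data row of ip_mr_cache."""
    rows = []
    for line in text.strip().splitlines()[1:]:  # skip the header row
        group, origin, iif = line.split()[:3]
        rows.append((hex_to_ip(group), hex_to_ip(origin), int(iif)))
    return rows


sample = """\
Group Origin Iif Pkts Bytes Wrong Oifs
096423EF 3E239EC1 1 2 520 0 0:1
1B2823EF FB239EC1 1 160709 217732760 0 0:1
FAFFFFEF 4D1410AC -1 0 0 0
"""
for row in parse_mr_cache(sample):
    print(row)
```

Decoded this way, the first row is group 239.35.100.9 from source 193.158.35.62 arriving on VIF 1 (wan0.8), and the last row is group 239.255.255.250 from the LAN client 172.16.20.77 with no input VIF.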

I will still have to check for a few hours to make sure.

cbean avatar Jun 16 '16 04:06 cbean

So far so good, running perfectly.

One question:

#rescanvif

isn't set in igmpproxy.conf.

Does igmpproxy still scan new and vanished interfaces, or is it static?

Side effect:

It seems that with the version "with dynamic detect on the fly" the machine runs about 7-8 °C hotter. Average temp is ~60 °C instead of 52 °C. I have no idea where the heat comes from, since CPU usage is below 2%. If I use the igmpproxy version with dynamic detect reverted, the machine is about 7-8 °C cooler.

cbean avatar Jun 16 '16 08:06 cbean