rp2040-dmxsun

Issues with W5500 Branch

Open BastelPichi opened this issue 1 year ago • 13 comments

First off, thanks for this awesome project! It has really come a long way and looks very promising.

USB works fine, full 16 universes, no problemo. However, I'm not having much success with the W5500 branch from your repo. I've wired up the W5500 and confirmed that it works with another project: mongoose.

The code always gets stuck here: image

The "Link UP detected" also doesn't work when unplugging and plugging the cable back in. It doesn't really matter whether I boot with or without the LAN cable plugged in.

The module is connected via a switch to my PC and router. When I unplug the uplink from the switch, capture traffic with Wireshark, and start the Pico with the W5500, nothing really happens. However, the link LEDs do blink periodically.

I've tried DHCP both with and without fallback, and neither works.

Do you have any idea? Am I missing something? This happens both with GH Actions build and a build on my PC.

BastelPichi, Oct 16 '24 14:10

I've made progress. I'm having issues with DHCP now: after the server offers a DHCP lease, the Pico just stops responding and sends a Discover frame a few seconds later.

BastelPichi, Oct 22 '24 21:10

Hey @BastelPichi, indeed, the W5500 branch is very experimental and not yet working as hoped. I started work on it a while back, and in the meantime I got the W5500 working in another project. I need to backport the changes done there to the branch here sometime. You mean the DHCP client of the dmxsun is not working / correctly parsing the DHCP server's OFFER?

kripton, Oct 26 '24 21:10

> You mean the DHCP client of the dmxsun is not working / correctly parsing the DHCP server's OFFER?

Exactly. The Pico receives an offer, then nothing happens. It keeps retrying, and USB keeps working. I'm also not having any success with a static IP (neither ping nor receiving Art-Net works). As far as I have seen, shouldn't all of the DHCP client work be done by lwIP directly?

I was looking at the Wiznet library for their Pico W5500 board. If you change the pin config, it's compatible with the current pin layout, and that might be more future-proof than using the 5-year-old W5500MacRaw library.

The initial issue I had with WIZOK:1 was solved by adding a small delay at startup. Not sure exactly why, but it solved the issue.
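One way to picture the startup-delay workaround: give the chip a moment to settle after power-up, then probe it a few times before giving up. This is only a sketch. `wait_for_chip` and the `probe` callback are hypothetical names, not part of the branch or of any W5500 library, and `std::this_thread::sleep_for` stands in for the Pico SDK's `sleep_ms`.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper: wait once for the chip to settle, then call `probe`
// up to `attempts` times, pausing `retry_gap` between failed attempts.
// Returns true as soon as one probe succeeds.
bool wait_for_chip(const std::function<bool()>& probe,
                   int attempts,
                   std::chrono::milliseconds settle,
                   std::chrono::milliseconds retry_gap) {
    std::this_thread::sleep_for(settle);  // the "small delay at startup"
    for (int i = 0; i < attempts; ++i) {
        if (probe()) {
            return true;                  // chip answered as expected
        }
        std::this_thread::sleep_for(retry_gap);
    }
    return false;                         // chip never became ready
}
```

On hardware, `probe` would be whatever check the driver already performs during initialization. The delay values are something to find by experiment, since the thread does not say how long the delay needed to be.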

I'm interested in contributing to this awesome project. Is there any chance I can contact you on Discord or IRC?

BastelPichi, Oct 27 '24 16:10

image image image

For reference.

BastelPichi, Oct 27 '24 19:10

This is getting interesting. You found the branch in my personal fork but reported the issue here. Nice :) Took me some time to realize.

Can you try setting LWIP_NETIF_HOSTNAME to 0 in https://github.com/kripton/rp2040-dmxsun/blob/ethernet_w5500/src/lwipopts.h#L48? If something leads to compilation failures, just surround it with an #if LWIP_NETIF_HOSTNAME
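A minimal sketch of that `#if` guard, assuming the code that fails to compile touches the hostname field. `FakeNetif` and `configure_hostname` are made-up stand-ins so the snippet compiles without lwIP; in lwIP itself, the `hostname` member of `struct netif` only exists when `LWIP_NETIF_HOSTNAME` is enabled.

```cpp
// In the real project this define lives in lwipopts.h.
#ifndef LWIP_NETIF_HOSTNAME
#define LWIP_NETIF_HOSTNAME 0
#endif

// Stand-in for lwIP's struct netif (assumption, for illustration only).
struct FakeNetif {
#if LWIP_NETIF_HOSTNAME
    const char* hostname;  // only present when the option is enabled
#endif
    int num;
};

// Hypothetical helper: sets the hostname when the feature is compiled in,
// otherwise does nothing. Returns the hostname in effect, or nullptr.
const char* configure_hostname(FakeNetif& nif, const char* name) {
#if LWIP_NETIF_HOSTNAME
    nif.hostname = name;
    return nif.hostname;
#else
    (void)nif;   // hostname support compiled out
    (void)name;
    return nullptr;
#endif
}
```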

I first thought this could be related to https://github.com/hathach/tinyusb/pull/1712 but that was on the DHCP server side, not the client.

Cool that you want to contribute! It's been a long time since I was last on IRC, but I could start that again. Discord would also be an option. In any case, you can also send me an email. That would not require setting up a time slot ;)

kripton, Nov 02 '24 06:11

> Can you try setting LWIP_NETIF_HOSTNAME to 0 in https://github.com/kripton/rp2040-dmxsun/blob/ethernet_w5500/src/lwipopts.h#L48? If something leads to compilation failures, just surround it with an #if LWIP_NETIF_HOSTNAME

That doesn't change anything. I've enabled logging:

dhcp_timeout(): restarting discovery
dhcp_discover()
transaction id xid(775b85a4)
dhcp_discover: making request
dhcp_discover: sendto(DISCOVER, IP_ADDR_BROADCAST, LWIP_IANA_PORT_DHCP_SERVER)
dhcp_discover: deleting()
dhcp_discover: SELECTING
dhcp_discover(): set request timeout 8000 msecs

That would explain that. Communication between the module and the Pico definitely works (it sends out the broadcast frames, after all).
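For anyone wanting to reproduce a log like this: lwIP's DHCP trace output is controlled from `lwipopts.h`. The fragment below is a sketch of the usual options, assuming the port routes lwIP's diagnostic output to a usable console. The exact settings used for the log above are not given in the thread.

```cpp
// lwipopts.h (fragment): turn on lwIP's DHCP client trace messages.
#define LWIP_DEBUG          1                   // compile debug output in
#define DHCP_DEBUG          LWIP_DBG_ON         // enable the DHCP module's messages
#define LWIP_DBG_MIN_LEVEL  LWIP_DBG_LEVEL_ALL  // do not filter by severity
#define LWIP_DBG_TYPES_ON   LWIP_DBG_ON         // do not filter by message type
```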

BastelPichi, Nov 02 '24 18:11

Are you using the boards from the "hardware" folder or do you use custom hardware / wiring? Since you said static IP is not working either: to me it looks like the IRQ line is not working properly. Other projects/libs using the W5500 can do without it (Adafruit's library, for example). Initialization will work fine without the IRQ and probably packet sending as well, however packet reception WILL fail. Looks like what you are experiencing.

In case you are using the baseboard from the hardware folder, there is a solder jumper (JP1) close to the W5500 that selects if the IRQ line (there is only one to the Pico since I ran out of pins) is routed to the nRF24 or the W5500. If you do not solder that to the W5500, it might cause such issues.

The mongoose project looks very interesting! I was aware that it exists as a webserver-as-a-library, but I was not aware that they provide a complete TCP/UDP stack including a W5500 driver. It might be worth investigating whether it can replace lwIP completely. Benefits would be a webserver with a less-limited API and with websocket support. However, it would need to handle multiple network interfaces (USB to host, Ethernet via W5500 plus optional WiFi on the Pico W) and we would need to replace lwIP's fsdata approach.

kripton, Nov 03 '24 09:11

> Are you using the boards from the "hardware" folder or do you use custom hardware / wiring?

I'm using my custom wiring for now.

> Initialization will work fine without the IRQ and probably packet sending as well, however packet reception WILL fail. Looks like what you are experiencing.

The IRQ Callback does get called. image

However, we only get about 1-2 IRQs after startup, then they stop completely. Hooking up a logic analyser, I can't even see those two frames. So maybe something is going wrong while enabling the interrupts?

BastelPichi, Nov 03 '24 10:11

Root cause found: eth_w5500 defines a method called service_traffic but it is never called. Add it to webserver.cpp in the cyclicTask() method. On my desk, it fixes the DHCP client. However, the W5500 link detection behaves flaky and it reports a missing and then a re-established link VERY fast. Not stable here, yet :/
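The fix described here boils down to one call from the periodic task. The sketch below models it with stand-in classes: `FakeEthW5500`, `FakeWebServer` and `frame_arrived` are invented for illustration, and only the names `service_traffic` and `cyclicTask` come from the comment above. The real method signatures in the branch may differ.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <utility>
#include <vector>

using Frame = std::vector<std::uint8_t>;

// Stand-in for the W5500 driver: frames pile up in the chip's RX buffer
// until somebody drains them.
class FakeEthW5500 {
public:
    // Simulates the hardware receiving a frame from the wire.
    void frame_arrived(Frame f) { rx_buffer_.push_back(std::move(f)); }

    // Drains the RX buffer and hands every frame to the network stack.
    // Returns the number of frames handed over.
    std::size_t service_traffic() {
        std::size_t handed_over = 0;
        while (!rx_buffer_.empty()) {
            delivered_.push_back(std::move(rx_buffer_.front()));
            rx_buffer_.pop_front();
            ++handed_over;
        }
        return handed_over;
    }

    // Number of frames the stack has seen so far.
    std::size_t delivered() const { return delivered_.size(); }

private:
    std::deque<Frame> rx_buffer_;
    std::vector<Frame> delivered_;
};

// Stand-in for the web server component that owns the periodic task.
class FakeWebServer {
public:
    explicit FakeWebServer(FakeEthW5500& eth) : eth_(eth) {}

    void cyclicTask() {
        eth_.service_traffic();  // the call that was missing
        // ... other periodic work would follow here
    }

private:
    FakeEthW5500& eth_;
};
```

Without the call in `cyclicTask()`, received frames (such as the DHCP server's OFFER) never reach the stack, which matches the symptom of the client timing out and sending another DISCOVER.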

kripton, Nov 04 '24 00:11

I did some more playing around yesterday. The wireless code needs to be disabled as well, since the W5500 and the nRF share the SPI bus but are controlled by different cores. Nevertheless, something still seems very flaky and unstable. Code pushed to the ethernet_w5500 branch. I also rebased that on top of the latest OpenLightingProject/rp2040-dmxsun/main.
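The hazard with a shared SPI bus is that a transaction started on one core can be interleaved with one from the other core. The branch avoids this by disabling the wireless code. For comparison, a common alternative is to serialize access with a lock, sketched below. `SharedSpiBus` is a hypothetical class, and `std::mutex` stands in for whatever cross-core primitive the target platform offers; nothing here is taken from the project.

```cpp
#include <mutex>
#include <utility>

// Hypothetical wrapper: every complete SPI transaction (select the chip,
// transfer, deselect) runs while holding the lock, so two users of the bus
// cannot interleave their transfers.
class SharedSpiBus {
public:
    template <typename Fn>
    void transaction(Fn&& fn) {
        std::lock_guard<std::mutex> lock(mutex_);
        std::forward<Fn>(fn)();  // caller's chip-select + transfer sequence
        ++completed_;
    }

    // Number of transactions finished so far.
    int completed() {
        std::lock_guard<std::mutex> lock(mutex_);
        return completed_;
    }

private:
    std::mutex mutex_;
    int completed_ = 0;
};
```

The W5500 driver and the nRF24 driver would each wrap their transfers in `transaction(...)`. Whether that is enough here is an open question, since the thread reports instability even with the wireless code disabled.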

kripton, Nov 04 '24 12:11

I should have the wireless (nRF) disabled; I'm not really interested in that feature anyway. (WiFi would be nice though, I don't have a Pico W here yet.)

The link detection didn't really work at all for me. I can try your changes.

However, I do need to look into the interrupts. From my understanding, the handletraffic only reads out the traffic manually after a 100 ms timeout.
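A sketch of the two triggers discussed here: service the chip as soon as an interrupt is flagged, and fall back to a timed poll otherwise. `RxScheduler` and its methods are hypothetical, and the 100 ms interval is taken from the comment above rather than checked against the code.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helper that decides when the main loop should read the chip.
class RxScheduler {
public:
    explicit RxScheduler(std::uint32_t poll_interval_ms)
        : poll_interval_ms_(poll_interval_ms) {}

    // Called from the GPIO interrupt handler: only sets a flag, no SPI here.
    void on_irq() { irq_pending_.store(true, std::memory_order_release); }

    // Called from the main loop with the current time in milliseconds.
    // Returns true if the RX buffer should be read now, either because an
    // IRQ was flagged or because the polling interval has elapsed.
    bool should_service(std::uint32_t now_ms) {
        const bool irq = irq_pending_.exchange(false, std::memory_order_acquire);
        // Unsigned subtraction stays correct across timer wrap-around.
        const bool timed_out = (now_ms - last_service_ms_) >= poll_interval_ms_;
        if (irq || timed_out) {
            last_service_ms_ = now_ms;
            return true;
        }
        return false;
    }

private:
    std::atomic<bool> irq_pending_{false};
    std::uint32_t poll_interval_ms_;
    std::uint32_t last_service_ms_ = 0;
};
```

With this shape, a dead IRQ line degrades reception to one read per polling interval instead of stopping it entirely, which also makes a wiring problem easier to tell apart from a driver problem.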

Also, don't we need to use the MacRaw branch with Pico support? I don't see a switch to that branch in your git submodules.

BastelPichi, Nov 04 '24 12:11

While DHCP now completes, lwIP goes completely nuts. It constantly tries to get a new IP, despite successfully getting one. This completely clogs up the Pico, and not even the USB network works anymore. image

BastelPichi, Nov 06 '24 20:11

Yeah, that behaviour is exactly what I meant with "However, the W5500 link detection behaves flaky and it reports a missing and then a re-established link VERY fast.". The W5500 reports an Ethernet link -> the DHCP client is started and sends a DISCOVER. Then the W5500 reports no link -> the interface goes down. Then the W5500 reports a link again, and this cycle leads to the huge number of DISCOVER packets you see there. You can watch the whole process in the logs on the emulated serial port if you do not want to rely on (emulated) Ethernet to fetch them.
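One common way to keep a flapping link bit from restarting DHCP on every edge is to debounce it: only accept a new link state after the raw reading has stayed unchanged for some time. The sketch below is not from the branch. `LinkDebouncer` is a hypothetical class, and the 500 ms window in the usage note is an arbitrary example value.

```cpp
#include <cstdint>

// Hypothetical debouncer for the raw link bit read from the Ethernet chip.
class LinkDebouncer {
public:
    explicit LinkDebouncer(std::uint32_t stable_ms) : stable_ms_(stable_ms) {}

    // Feed the raw link reading and the current time in milliseconds.
    // Returns the debounced link state, which only changes once the raw
    // reading has been constant for at least `stable_ms`.
    bool update(bool raw_up, std::uint32_t now_ms) {
        if (raw_up != candidate_) {
            candidate_ = raw_up;   // raw reading changed: restart the timer
            since_ms_ = now_ms;
        } else if (candidate_ != stable_ &&
                   (now_ms - since_ms_) >= stable_ms_) {
            stable_ = candidate_;  // reading held long enough: accept it
        }
        return stable_;
    }

private:
    std::uint32_t stable_ms_;
    std::uint32_t since_ms_ = 0;
    bool candidate_ = false;
    bool stable_ = false;
};
```

The interface would then be brought up or down (and the DHCP client started) only when the return value of `update()` changes, for example with `LinkDebouncer debouncer(500);`. This hides the symptom rather than explaining why the link report flaps in the first place.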

kripton, Nov 10 '24 21:11