pico-sdk icon indicating copy to clipboard operation
pico-sdk copied to clipboard

No way to support custom LWIP implementations when NO_LWIP enabled

Open earlephilhower opened this issue 2 years ago • 5 comments

The Pico SDK gives a good, simple set up for LWIP use, but it combines the WiFi state control and the TCP stack setup in cyw43_lwip https://github.com/georgerobotics/cyw43-driver/blob/195dfcc10bb6f379e3dea45147590db2203d3c7b/src/cyw43_lwip.c and in the SDK proper in pico_cyw43_arch.

Unfortunately, I think this makes it impossible, without duplicating large portions of the CYW43 support in the SDK, to do more advanced setup (i.e. configure the AP to other addresses, start with a static IP address, use IPV6) and use.

The SDK can be configured with CYW43_LWIP=0 while still enabling the CYW43 driver (I assume mainly for LED support?) and all the WiFi link-level support, but any use of LWIP at the app-level is denied because there are predefined callbacks the CYW43 driver calls that are hard-coded to panic the machine: https://github.com/raspberrypi/pico-sdk/blob/2e6142b15b8a75c1227dd3edbe839193b2bf9041/src/rp2_common/pico_cyw43_arch/cyw43_arch_threadsafe_background.c#L300-L317

If these panic CBs were made weak, then the default behavior would still happen (as an aside, does it make sense to crash w/apanicin runtime on the device vs. just failing to link at build time?) but a user application could but an application could override them and do their own LWIP from-the-ground-up implementation.

The alternative now would be to build with CYW43_LWIP=0 but not use anything from pico_cyw43_arch and delete the fake CBs in the SDK .c library and implement your own cyw43_cb_process_ethernet, etc.

earlephilhower avatar Jul 09 '22 03:07 earlephilhower

#960 is related to this. Multicast packers are filtered in the CYW43 driver, making IPv6 communications impossible.

earlephilhower avatar Aug 05 '22 22:08 earlephilhower

I love my Raspberry Pi, and Pico's and I just got my hands on 5 Pico W's. I was extremely disappointed in the lack of documentation and examples for C/C++ of the SDK for Pico W. The "how to" official Raspberry Pico W guide on getting connected spends just a few pages on C/C++ before quickly moving on to python which takes up most of the document with more examples etc...

I understand for many Python is the language of choice. However Raspberry shouldn't leave those who choose C/C++ as 2nd class customers. We deserve the same level of attention and examples Python gets. I apologize if this is not the correct place for this to be brought up but I'm not sure how else to get feedback to Raspberry that will actually be seen/read.

You have nice pictures and examples for wiring an LED and seeing it on a webpage for Python, all missing from the C portion. Why? I hope Raspberry as the Pico W's continue to take off in popularity really steps up their game with more detailed examples, (commented code) and a robust SDK. I know ultimately many developers will do this on their own and share it, but I feel Raspberry owes it to the C community to not have to stumble so hard keeping in mind not all C programmers are experts such as myself.

Programming The Raspberry Pi Pico In C - by Fairhead, Harry - this book - was a godsend.

It got me up and running quickly versus what Raspberry offers. You have example projects with code and of course the SDK PDF, but a few more in-depth examples wouldn't hurt (especially in the SDK PDF) and sometimes they go a long way for people who are not C experts. I'm not giving up on C or the Pico W, I know more will come. I hope it gets the attention deserved.

The ESP8266 wifi module for example has tons of great information both in C and in python. I own some of these ESP8266's and My main attraction to the Pico W was it was built in and I was expecting a robust SDK. I hope you'll prove me right.

ProTip for Raspberry or anyone extremely skilled in C, if you write a book similar to the one I mentioned and go even deeper you will make a fortune!

I mean no disrespect and I am eager to see future documentation/SDK/code examples that are yet to come! C is not dead! :)

anglerfish27 avatar Oct 07 '22 19:10 anglerfish27

@anglerfish27 you will only annoy people putting random comments on random bug reports. You'll achieve more by commenting on the forum. Have you looked at the examples in the pico-examples repo? What's missing? Post your feedback in the forum

peterharperuk avatar Oct 07 '22 20:10 peterharperuk

I'm not sure how else to get feedback to Raspberry that will actually be seen/read.

If you want to provide actual feedback, rather than just moaning, you can do so at https://github.com/raspberrypi/pico-feedback

The reason there's so much more hand-holding in the getting-started guide for Python than there is for C, is because we've naturally assumed that more beginners will choose to use Python than use C.

As Peter mentions, there are additional C examples for Pico W in https://github.com/raspberrypi/pico-examples/tree/develop/pico_w

lurch avatar Oct 07 '22 20:10 lurch

@anglerfish27 I will remove your comment soon, since it has nothing to do with this issue... so please copy/paste it if you wish to submit it to pico-feedback

kilograham avatar Oct 08 '22 14:10 kilograham

I've had some serious user hours running the PicoW and a some painful debugging of my Arduino core code to get things stable and come up with several issues related to this LWIP/CYW43 stuff. If it would be better to close this issue and open a new one, please let me know. Also, again, I'm not trying to present myself as an LWIP expert...but I have seen a lot of the LWIP code from within GDB at this point. 😆

Weak calls for the cyw43_cb callbacks

This is needed to allow separation of WiFi physical link and the TCP/IP link state. The original PR (closed now) seems to meet this specific requirement. https://github.com/raspberrypi/pico-sdk/blob/2e6142b15b8a75c1227dd3edbe839193b2bf9041/src/rp2_common/pico_cyw43_arch/cyw43_arch_threadsafe_background.c#L300-L317

"Background" LWIP mode is dangerous and should be deprecated/removed or all LWIP calls should be wrapped

LWIP is not multi-thread safe/re-entrant in the NO_SYS mode, at all. See https://www.nongnu.org/lwip/2_0_x/pitfalls.html . But, the CYW43 wrapper is calling sys_check_timeouts from an alarm/IRQ. 99% of the time this is fine in a lightly loaded system. But under load you may end up inside LWIP code when the IRQ fires and sys_check_timeouts ends up modifying LWIP state. When that happens you will end up crashing as things become inconsistent internally. I've also seen infinite loops inside LWIP because of this, so even though there is no crash there is a hang.

Also, the CYW43 driver https://github.com/georgerobotics/cyw43-driver/blob/195dfcc10bb6f379e3dea45147590db2203d3c7b/src/cyw43_lwip.c#L219 is also using an LWIP netif->input() call from within an IRQ callback from the WiFi chip. The same issue happens at this point because LWIP isn't re-entrant in NO_SYS. You could be within an LWIP block and hit this line and the LWIP state will become borked.

A potential (extreme) solution to this would be to use the linker wrap -Wl--wrap=... option to wrap all LWIP calls and set a semaphore so that the sys_check_timeouts and the netif->input() call can be skipped if the IRQ happens while w/in LWIP code. My own code ended up doing this with simple wrappers and either skipping the sys_check_timeouts call and the netif->input calls (dropping the packet) while inside an LWIP call.

The background mode is really nice from a user perspective and would be great to keep, IMHO. But it's just not stable under continuous load w/o some work in my experience.

I think SYS_ARCH_PROTECT/UNPROTECT should be defined for background (NO_SYS) mode

This is more of a pico-examples thing, but the LWIP memory management routines are not re-entrant (see above). pbuf_xxxx is legal to call from within an interrupt context, but it is not interrupt safe either. So, either the pbuf_xxx calls need to be protected in the same way as other upper-LWIP layers, or interrupts need to be disabled for safety. For example, the CYW43 Ethernet driver could allocate a PBUF to store the received packed from within an IRQ even if it couldn't call netif->input() due to the block described above (i.e. the LWIP wrappers could netif->input() the PBUF at some safe point later).

In my own experience the above changes moved my CYW43/LWIP from crashing after 10s of minutes and 10,000s of GET requests to running for > 24hrs and 2 million GETs w/o a crash or hang while in NO_SYS mode.

earlephilhower avatar Oct 26 '22 01:10 earlephilhower

Thanks for the detailed report;

LWIP is not multi-thread safe/re-entrant in the NO_SYS mode, at all. See https://www.nongnu.org/lwip/2_0_x/pitfalls.html . But, the CYW43 wrapper is calling sys_check_timeouts from an alarm/IRQ

This is a bug which i just noticed myself this week.

kilograham avatar Oct 26 '22 04:10 kilograham

^ but yes, please separate the above info into a new issue

kilograham avatar Oct 26 '22 04:10 kilograham

I've hidden the comment here and moved to #1079. The weak callbacks are still needed and covered w/the original notes.

earlephilhower avatar Oct 26 '22 19:10 earlephilhower

fixed by #1177

kilograham avatar Jan 20 '23 21:01 kilograham

#1177 merged into develop

kilograham avatar Jan 24 '23 18:01 kilograham