RF24
RF24 copied to clipboard
Use Linux kernel's character device API to implement IRQ capability
This guide does mention how IRQ could be done using some poll()
function (which is unknown to me), but it uses a blocking example.
The main obstacle (for me) is using a separate processor thread to poll the GPIO for IRQ edge detection.
Originally posted by @2bndy5 in https://github.com/nRF24/RF24/issues/942#issuecomment-1950890771
FYI just tested the interrupt example and pigpio doesn't seem to work on the RPi5 yet.
+---------------------------------------------------------+
|Sorry, this system does not appear to be a raspberry pi. |
|aborting. |
+---------------------------------------------------------+
Now you have me playing with pthreads again... this is going to take some time. I have a simple example triggering an IRQ using pins 22 or others that are toggled manually, but can't get it to trigger on the radio IRQ pin for some reason. Wondering if it needs to be set to an input or something first? The guide doesn't seem to do that...
Also wondering if I should continue, or if this is something you wanted to take on?
I could take a stab at it, but I'll need you to verify RPi5 compatibility.
Still researching pthread... Looks like the function passed to pthread_create()
only executes once in a separate thread -- using default scheduling and thread attrs, I think -- then the thread terminates upon exit of the passed function.
Clearly, I've never done multi-threaded processing in C++. Some of my input here might be obvious to those with experience using pthread.
Ok, I'm booted into Ubuntu and inspecting /usr/include/linux/gpio.h. And I just found out that there is a v2 API in gpio.h. Some of the structs used in #942 are actually deprecated already 😡. For example:
gpiohandle_request
is defined as
/**
* struct gpiohandle_request - Information about a GPIO handle request
* @lineoffsets: an array of desired lines, specified by offset index for the
* associated GPIO device
* @flags: desired flags for the desired GPIO lines, such as
* %GPIOHANDLE_REQUEST_OUTPUT, %GPIOHANDLE_REQUEST_ACTIVE_LOW etc, added
* together. Note that even if multiple lines are requested, the same flags
* must be applicable to all of them, if you want lines with individual
* flags set, request them one by one. It is possible to select
* a batch of input or output lines, but they must all have the same
* characteristics, i.e. all inputs or all outputs, all active low etc
* @default_values: if the %GPIOHANDLE_REQUEST_OUTPUT is set for a requested
* line, this specifies the default output value, should be 0 (low) or
* 1 (high), anything else than 0 or 1 will be interpreted as 1 (high)
* @consumer_label: a desired consumer label for the selected GPIO line(s)
* such as "my-bitbanged-relay"
* @lines: number of lines requested in this request, i.e. the number of
* valid fields in the above arrays, set to 1 to request a single line
* @fd: if successful this field will contain a valid anonymous file handle
* after a %GPIO_GET_LINEHANDLE_IOCTL operation, zero or negative value
* means error
*
* Note: This struct is part of ABI v1 and is deprecated.
* Use &struct gpio_v2_line_request instead.
*/
struct gpiohandle_request {
__u32 lineoffsets[GPIOHANDLES_MAX];
__u32 flags;
__u8 default_values[GPIOHANDLES_MAX];
char consumer_label[GPIO_MAX_NAME_SIZE];
__u32 lines;
int fd;
};
and gpio_v2_line_request
is defined as
/**
* struct gpio_v2_line_request - Information about a request for GPIO lines
* @offsets: an array of desired lines, specified by offset index for the
* associated GPIO chip
* @consumer: a desired consumer label for the selected GPIO lines such as
* "my-bitbanged-relay"
* @config: requested configuration for the lines.
* @num_lines: number of lines requested in this request, i.e. the number
* of valid fields in the %GPIO_V2_LINES_MAX sized arrays, set to 1 to
* request a single line
* @event_buffer_size: a suggested minimum number of line events that the
* kernel should buffer. This is only relevant if edge detection is
* enabled in the configuration. Note that this is only a suggested value
* and the kernel may allocate a larger buffer or cap the size of the
* buffer. If this field is zero then the buffer size defaults to a minimum
* of @num_lines * 16.
* @padding: reserved for future use and must be zero filled
* @fd: if successful this field will contain a valid anonymous file handle
* after a %GPIO_GET_LINE_IOCTL operation, zero or negative value means
* error
*/
struct gpio_v2_line_request {
__u32 offsets[GPIO_V2_LINES_MAX];
char consumer[GPIO_MAX_NAME_SIZE];
struct gpio_v2_line_config config;
__u32 num_lines;
__u32 event_buffer_size;
/* Pad to fill implicit padding and reserve space for future use. */
__u32 padding[5];
__s32 fd;
};
LOL.
Well I don't know about you, but I'm taking a break.
Yeah, I have plans for later today. But I'll keep at it throughout the coming weeks. I think I'll finish the clang-format updates first.
I've also been looking at the libgpiod code, and they cache everything, possibly because nVidia Jetson/TX2/etc (& probably some RPi clones) can have multiple character devices (dev/gpiochipX
) for GPIO ports.
Apparently, MRAA can detect RPi5 hardware, but I don't know if that means MRAA will completely work on RPi5. They have been supporting character device API for some time...
Pigpio (https://github.com/joan2937/pigpio/issues/586) and BCM2835 (google groups discussion) libs are still assessing how to implement compatibility. I doubt BCM2835 will go on supporting the new hardware paradigm (which ultimately requires interfacing with PCI express). Pigpio might move forward since it already uses the Linux kernel for some stuff.
Remember, WiringPi is dead and littleWire has been broken since before I joined nRF24 org. It is starting to look like our SPIDEV driver might be the only way to support Linux in the future. I'm not saddened by this from a maintenance point of view, but user projects will suffer this new RPi hardware paradigm.
libgpiod's asynch_watch_line_value.c example seems like a big clue about using pthread with char-dev API.
Yeah, I had a feeling the GPIO changes would be slower than previous implementations, but never did any performance testing. Now that we have some working code for v2, caching everything shouldn't be a big challenge.
I took a break. From a design perspective, I think the gpio caches and interrupt implementation should be separate to reduce the created thread's resources.
As far as caching, I think we can just have 1 gpio_line_request
and use the gpio_line_request.attrs
to configure multiple offsets
as input or output. I'm also considering 1 cached gpio_line_request
for input lines and another for output lines. Not sure about GPIO destructor though since the cached FDs would have to be closed upon before app exit.
I might be entirely overthinking again.
Hey if it works it works. We should be able to get the same or better performance out of it than prior versions I would hope.
I agree with keeping the GPIO and IRQ stuff separate. The caching I don't know about, I'd have to look at GPIO code more in depth to form a relevant opinion.
So I think I'm caching the FDs properly (in char-dev-irq branch).
- I have a
gpio_cache
struct closing the FDs in it's destructor, so there shouldn't be any need to modify RF24's destructor. - I'm only using 1 cached
gpio_v2_line_request
obj for all pins used by RF24; usesrequest.config.attrs[offset].attr.flags
instead ofrequest.config.flags
. - I also cache 1
gpio_v2_line_values
obj for all pins' input/output ops.
It compiles and executes without errors, but running the scanner example shows no signal gets detected. 😞
On a side note, I think our old sysfs approach does not properly free the pins upon app exit. I have to reboot my RPi to allow the char-dev API access to the GPIO22.
Nice work!
Mostly tests fine for me, but I am seeing differences in the scanner example on RPi5, RPi4 and RPi3. The RPi2 seems to not care.
Teh RPI4 comes up with long lines like this sometimes tho:
666666655555555555554444444444433333333333333333333333333333333333333333333333333333333333333333333333333334444444444445556666
Also, on a side note, since the FDs are cached and remain open, the driver now gives a nice error report if you try to run two instances of RF24 at the same time, using the same pins. Sweet deal, this is something I did regularly when running through tests etc. so its a nice behavior to have.
Hmm, taking a look at RF24.cpp, I found something unusual that I haven't thought about in a while, but see the lines here
Essentially, on faster devices, I put in a delay of 5us when toggling the CS pin. If I modify the code to the following, it seems to work much better on RPi with the scanner example.
#if !defined(RF24_LINUX)
digitalWrite(csn_pin, mode);
#else
static_cast<void>(mode); // ignore -Wunused-parameter
#endif // !defined(RF24_LINUX)
delayMicroseconds(csDelay); // Delay for all devices
}
Maybe we could do something like the following for the CS pin?:
#if (!defined(F_CPU) || F_CPU > 20000000) && defined FAILURE_HANDLING
delayMicroseconds(csDelay);
#endif
The whole point of the delay was double: a: Ensure the pin gets toggled for at least 5us b: Slow down the polling via SPI, leaving the radio to process radio data instead of being hammered by SPI requests
The higher layers seem to perform better too with this change.
I was going to play with char-dev debouncing too. I think with caching, we've hit the too fast problem.
its a nice behavior to have
I also set the consumer string in the request
obj. So, now if you run gpioinfo gpiochip0
(a tool provided by libgpiod) while a RF24 app is running, you'll see which pins are consumed by the "RF24 lib".
I think with caching, we've hit the too fast problem.
Yup, there needs to be some sort of GPIO/CSN delay for Linux now along with the faster MCUs. Will you include this in your branch then?
Yeah, coding the debouncing now. There's a limit of 10 attrs that we can configure for each request.config
, so I'm going with 3: 1 attr to declare which pins are inputs, 1 attr to declare which pins are outputs, and 1 attr to declare the debouncing period of 5 microseconds on each output pin.
I think only 1 type of attr (input flag, output flag, debounce period) can be associated with a pin. Meaning, setting an attr for output and another attr for debouncing on 1 pin doesn't take.
/**
* struct gpio_v2_line_config - Configuration for GPIO lines
* @flags: flags for the GPIO lines, with values from &enum
* gpio_v2_line_flag, such as %GPIO_V2_LINE_FLAG_ACTIVE_LOW,
* %GPIO_V2_LINE_FLAG_OUTPUT etc, added together. This is the default for
* all requested lines but may be overridden for particular lines using
* @attrs.
* @num_attrs: the number of attributes in @attrs
* @padding: reserved for future use and must be zero filled
* @attrs: the configuration attributes associated with the requested
* lines. Any attribute should only be associated with a particular line
* once. If an attribute is associated with a line multiple times then the
* first occurrence (i.e. lowest index) has precedence.
*/
struct gpio_v2_line_config {
__aligned_u64 flags;
__u32 num_attrs;
/* Pad to fill implicit padding and reserve space for future use. */
__u32 padding[5];
struct gpio_v2_line_config_attribute attrs[GPIO_V2_LINE_NUM_ATTRS_MAX];
};
My initial (local) test has not shown any difference. I pushed what I have in case you can test with it. I also tried using 1 attr to define both output flag and debouncing period (for all output pins), alas I still don't see a difference.
I might have to switch to a std::map
of port
-> request
key/value pairs for each pin used. Alternatively, I guess we could try altering RF24.cpp, but I was hoping to avoid more ifdef
soup.
Yeah, I just tested it and I don't think debounce introduces a delay of 5us after toggling, it would be to prevent togging more often than 5us, so I don't think it will work in this application.
We might need an actual delay. Unfortunately I've been thinking about adding another, separate define to specifically make more functions interrupt safe, as right now all you can do is comment out FAILURE_HANDLING
. It might be best left as is though lol.
My std::map
attempt yielded same result. Its a pain when Linux kernel doc strings are so terse and online tutorials are scarce or outdated.
I think the debounce period is more for input pins. https://www.kernel.org/doc/html/v4.17/driver-api/gpio/driver.html#gpios-with-debounce-support
I added the delayMicroseconds()
call to RF24::ce()
. 😞 https://github.com/nRF24/RF24/commit/18045c7d4e111de1c6bc6d977e6ab3a07fa93d3c
We should probably have a RF24_SPIDEV
defined so we only delay 5us on CE toggle when using the SPIDEV driver. Other Linux drivers are slow enough to not have this problem.
Seems to be working now. I'm going to take a break before tackling the IRQ implementation...
I'm still getting the same erroneous results with the scanner examples with your current code.
The CE pin is only toggled on transmit, and delays are added in the lib already if needed. re: startWrite() Adding a delay to CS will affect reception as well as how often available() can be called etc.
But the gpio stuff in SPIDEV isn't managing the CS. The asserted CE used for entering RX is how I perceived the problem with the scanner example. Works fine on my RPi3.
ok, I'm getting periodic failures in gettingstarted on RPi3 followed by very high transmission times. I'm not getting any errors on my RPi4 with gettingstarted. Is this similar to what you're seeing?
I applied your idea locally
diff --git a/RF24.cpp b/RF24.cpp
index d4a7b33..7d7836e 100644
--- a/RF24.cpp
+++ b/RF24.cpp
@@ -92,10 +92,11 @@ void RF24::csn(bool mode)
#if !defined(RF24_LINUX)
digitalWrite(csn_pin, mode);
- delayMicroseconds(csDelay);
+ //delayMicroseconds(csDelay);
#else
static_cast<void>(mode); // ignore -Wunused-parameter
#endif // !defined(RF24_LINUX)
+ delayMicroseconds(csDelay);
}
/****************************************************************************/
@@ -108,7 +109,7 @@ void RF24::ce(bool level)
#endif
digitalWrite(ce_pin, level);
#ifdef RF24_LINUX
- delayMicroseconds(5);
+ //delayMicroseconds(5);
#endif
#ifndef RF24_LINUX
}
It yielded the same result but much slower scanner sweeps. The gettingstarted transmission times were considerably slower but ~more~ just as reliable on my RPi3
Transmission successful! Time to transmit = 1042 us. Sent: 0
Transmission successful! Time to transmit = 1047 us. Sent: 0.01
Transmission successful! Time to transmit = 1066 us. Sent: 0.02
Transmission successful! Time to transmit = 1047 us. Sent: 0.03
Transmission successful! Time to transmit = 1027 us. Sent: 0.04
Transmission successful! Time to transmit = 1031 us. Sent: 0.05
Transmission successful! Time to transmit = 1039 us. Sent: 0.06
Transmission successful! Time to transmit = 2752 us. Sent: 0.07
Transmission successful! Time to transmit = 2658 us. Sent: 0.08
Transmission successful! Time to transmit = 2721 us. Sent: 0.09
Transmission successful! Time to transmit = 1024 us. Sent: 0.1
Transmission failed or timed out
Transmission successful! Time to transmit = 4387 us. Sent: 0.11
Transmission failed or timed out
Transmission successful! Time to transmit = 2675 us. Sent: 0.12
Transmission successful! Time to transmit = 1039 us. Sent: 0.13
Transmission successful! Time to transmit = 9715 us. Sent: 0.14
Transmission successful! Time to transmit = 999 us. Sent: 0.15
Transmission successful! Time to transmit = 1038 us. Sent: 0.16
Transmission failed or timed out
Transmission failed or timed out
Transmission successful! Time to transmit = 1039 us. Sent: 0.17
Transmission successful! Time to transmit = 1023 us. Sent: 0.18
Transmission failed or timed out
Transmission failed or timed out
6 failures detected. Leaving TX role.
You are correct I figure, forgot to add the CE delay to RF24.cpp when testing. Now the scanner examples seem to work fine!
My gettingstarted examples seem to work fine with or without the CE delay.
It seems I was getting better high-speed transfers at the RF24Gateway layer with the CS delay. Testing with data over SSH and IPerf3. Now I don't know what the answer is. :p
I'm at a loss as well. This is why I can't have nice things.
I think the CE delay makes more logical sense, so lets go with that for now. We can always adjust later if needed.