
FFM API code base

Open DigitalSmile opened this issue 7 months ago • 9 comments

This PR is work in progress, will be adding new implementations step by step.

  • [ ] Overall design
  • [x] Digital Input
  • [x] Digital Output
  • [x] i2c via smbus
  • [x] i2c via ioctl
  • [x] i2c via file write/read
  • [x] spi
  • [x] hardware pwm
  • [x] uart

Some implementation details:

  • ~~I am using my own library to simplify generating of structs and calls with FFM API (https://github.com/DigitalSmile/native-memory-processor). Library is using annotation processing to generate source files and boilerplate. The only compile dependency is small footprint annotations JAR.~~
  • wrote unit tests using the gpio-sim kernel module, which works only on Linux and requires some additional setup with scripts. The tests are restricted to run only on Linux hosts
  • for the source of structs/functions I am using my local kernel headers of a specific version, which is set by a system property in the POM file. It can easily be overridden in CI if needed.
  • added a simple JMH test. To run it, invoke clean install and then jmh:benchmark

DigitalSmile avatar Mar 29 '25 20:03 DigitalSmile

For the history, benchmark for the simple roundtrip (create, get state, shutdown):

Result "com.pi4j.plugin.jmh.DigitalInputPerformanceTest.testCreateShutdownRoundTrip":
  1.652 ±(99.9%) 0.121 ms/op [Average]
  (min, avg, max) = (1.615, 1.652, 1.698), stdev = 0.031
  CI (99.9%): [1.531, 1.772] (assumes normal distribution)
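
For readers wondering how the ± figure relates to the reported stdev: JMH derives it from a Student t interval over the iteration scores. The sketch below reproduces that arithmetic for the result above, assuming the run used five measurement iterations (the count is not shown in this output, so that is an assumption). The class and method names are made up for illustration.

```java
public class JmhErrorSketch {

    // Two-sided 99.9% Student t critical value for 4 degrees of freedom,
    // i.e. for 5 samples. Assumption: the benchmark ran 5 iterations.
    static final double T_999_DF4 = 8.610;

    // Half-width of the 99.9% confidence interval for a mean of 5 samples.
    static double errorForFiveSamples(double stdev) {
        return T_999_DF4 * stdev / Math.sqrt(5);
    }

    public static void main(String[] args) {
        double stdev = 0.031; // stdev reported in the result above, already rounded
        double error = errorForFiveSamples(stdev);
        // Comes out close to the reported 0.121; the small gap is due to the rounded stdev.
        System.out.printf("error = %.3f ms/op%n", error);
    }
}
```

With only five samples the t multiplier is large, so a single slow iteration widens the interval considerably.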

DigitalSmile avatar Mar 31 '25 06:03 DigitalSmile

Perhaps you can add a benchmark for the existing provider?

eitch avatar Mar 31 '25 13:03 eitch

That's for stage 2 :) I tried to do it quickly and failed, due to the lack of desktop (amd64) build targets for the natives. I can try to run the benchmark on real hardware, but since we are comparing against other approaches, it is better to stay with gpio-sim. Anyway, I will try to build locally later on.

DigitalSmile avatar Mar 31 '25 13:03 DigitalSmile

Okay, a bit of journey on benchmarks :)

TLDR

For libgpiod (Java code -> JNA Code -> libpi4j-gpiod.so (JNI-wrapper) -> libgpiod.so (native library) -> GPIO kernel syscalls)

Result "com.pi4j.plugin.gpiod.jmh.DigitalInputPerformanceTest.testCreateShutdownRoundTrip":
  0.252 ±(99.9%) 0.015 ms/op [Average]
  (min, avg, max) = (0.248, 0.252, 0.257), stdev = 0.004
  CI (99.9%): [0.236, 0.267] (assumes normal distribution)

For FFM API (Java code -> GPIO kernel syscalls)

Result "com.pi4j.plugin.ffm.jmh.DigitalInputPerformanceTest.testCreateShutdownRoundTrip":
  0.167 ±(99.9%) 0.008 ms/op [Average]
  (min, avg, max) = (0.165, 0.167, 0.170), stdev = 0.002
  CI (99.9%): [0.159, 0.175] (assumes normal distribution)

Bit of details

So, first things first: I manually built all the gpiod natives (that was a lot easier without Maven) and pointed the NativeLibraryLoader class at the library folder. Then I had to comment out lines in GpioDContext to prevent it from failing, since I am not running on a Pi (btw, why is there such a limitation?).

My naive approach was to write a test similar to https://github.com/Pi4J/pi4j/blob/e1f3079e19c0025271cb7bb947b6e05023e86038/plugins/pi4j-plugin-ffm/src/test/java/com/pi4j/plugin/jmh/DigitalInputPerformanceTest.java The problems began immediately with a SIGSEGV from the JVM, indicating that memory is corrupted:

malloc(): unsorted double linked list corrupted

I dug in and found that the threads created by monitorLineEvents usually freeze and cannot shut down properly. It seems the heap goes crazy somewhere between native calls <-> monitoring thread <-> main Java code. When I commented out the submission of the monitoring task to the ExecutorService, all benchmarks ran fine, giving me the result above.

So for the sake of comparison I did the same thing with the FFM code (commented out the monitoring thread creation) and reran the benchmark.

The results are, as expected, almost the same, with FFM slightly faster. That is simply because the roundtrip between the native world and the Java world is shorter. Anyway, all of the above is just FYI, so I don't think we need to take any action on it. I will continue with feature implementations. Next is i2c.
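
To illustrate what "Java code -> native call" without a JNI/JNA wrapper layer looks like, here is a minimal FFM downcall sketch. It assumes Java 22 or newer (where the FFM API is final) and calls the C library's strlen purely as a harmless stand-in; it is not code from the plugin, and the class and method names are hypothetical.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;

public class FfmStrlenSketch {

    // Calls the C function size_t strlen(const char *) directly from Java.
    // Assumption: a 64-bit platform, where size_t maps to JAVA_LONG.
    static long nativeStrlen(String text) {
        Linker linker = Linker.nativeLinker();
        // Look the symbol up in the default (libc) lookup of the native linker.
        MemorySegment address = linker.defaultLookup().find("strlen").orElseThrow();
        MethodHandle strlen = linker.downcallHandle(
                address,
                FunctionDescriptor.of(ValueLayout.JAVA_LONG, ValueLayout.ADDRESS));
        // The confined arena frees the native string when the block exits.
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment cString = arena.allocateFrom(text); // NUL-terminated UTF-8 copy
            return (long) strlen.invokeExact(cString);
        } catch (Throwable t) {
            throw new IllegalStateException("strlen downcall failed", t);
        }
    }

    public static void main(String[] args) {
        System.out.println(nativeStrlen("pi4j"));
    }
}
```

There is no intermediate shared library to build or load here, which is the hop the FFM path saves compared to the libgpiod chain above.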

DigitalSmile avatar Apr 01 '25 18:04 DigitalSmile

Nice write up! Thanks a lot for that insight!

eitch avatar Apr 02 '25 06:04 eitch

@taartspi just for the record: the SPI classes are just boilerplate code for now, I copy-pasted them for future reference. You can have a look at my implementation here (which I'll be using): https://github.com/DigitalSmile/gpio/blob/main/src/main/java/org/digitalsmile/gpio/spi/SPIBus.java

DigitalSmile avatar Apr 03 '25 07:04 DigitalSmile

Implemented i2c base classes, need further testing.

DigitalSmile avatar Apr 10 '25 16:04 DigitalSmile

Created SPI and Hardware PWM

DigitalSmile avatar Jun 13 '25 16:06 DigitalSmile

New performance tests (GPIO, I2C SMBus, SPI)

Benchmark                                           Mode  Cnt   Score    Error  Units
GPIOPerformanceTest.testInputRoundTrip              avgt    5   0.169 ±  0.005  ms/op
GPIOPerformanceTest.testInputWithListenerRoundTrip  avgt    5  11.766 ± 71.672  ms/op
GPIOPerformanceTest.testOutputRoundTrip             avgt    5   0.042 ±  0.003  ms/op
I2CPerformanceTest.testSMBusRoundTrip               avgt    5   0.035 ±  0.001  ms/op
SPIPerformanceTest.testWriteReadRoundTrip           avgt    5   0.037 ±  0.002  ms/op

Details

What is measured: average roundtrip time (JMH avgt mode) from building the Pi4J context to shutdown, including simple operations (write, read or querying state). Tests are run with mock Linux drivers, which simply echo the result of write operations: gpio-sim, i2c-stub and a self-made spi-mock (https://github.com/Pi4J/pi4j/blob/ffm-api/plugins/pi4j-plugin-ffm/src/test/native/spi-mock.c).

Analysis

GPIO Output, I2C and SPI are working as expected: the footprint and roundtrip time are minimal. GPIO Input with a state change listener takes more time, because the listener is implemented in a separate thread and there is thread management involved. Plain GPIO Input has higher timings; I need to investigate why it is about four times slower than GPIO Output (0.169 vs 0.042 ms/op).
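
As a rough picture of the thread management a listener adds, the sketch below models an input whose state is watched by a dedicated monitor thread that must be started on creation and stopped on shutdown. It is a simplified, hypothetical model (polling an in-memory flag), not the plugin's implementation; MonitoredInput and all method names are made up for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

public class ListenerThreadSketch {

    // Hypothetical stand-in for an input pin with a state change listener.
    static final class MonitoredInput implements AutoCloseable {
        private final AtomicBoolean state = new AtomicBoolean(false);
        private final ExecutorService monitor = Executors.newSingleThreadExecutor();

        MonitoredInput(Consumer<Boolean> listener) {
            monitor.execute(() -> {
                boolean last = false; // known initial state of the simulated pin
                while (!Thread.currentThread().isInterrupted()) {
                    boolean now = state.get();
                    if (now != last) {
                        last = now;
                        listener.accept(now);
                    }
                    try {
                        Thread.sleep(1); // polling interval of this sketch
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt(); // leave the loop
                    }
                }
            });
        }

        void simulateEdge(boolean newState) {
            state.set(newState);
        }

        boolean isStopped() {
            return monitor.isTerminated();
        }

        @Override
        public void close() {
            monitor.shutdownNow(); // interrupts the monitor thread
            try {
                monitor.awaitTermination(1, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    // Returns true if the listener saw the edge and the monitor thread stopped on close.
    static boolean edgeDeliveredAndStopped() {
        CountDownLatch seen = new CountDownLatch(1);
        MonitoredInput input = new MonitoredInput(high -> {
            if (high) {
                seen.countDown();
            }
        });
        boolean delivered = false;
        try {
            input.simulateEdge(true);
            delivered = seen.await(1, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            input.close();
        }
        return delivered && input.isStopped();
    }
}
```

Every create/shutdown roundtrip in this model pays for starting a thread, handing over the event and waiting for the thread to terminate, none of which the plain input path needs.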

If someone can write JMH tests for current implementations of gpiod/sysfs/pigpio that would be very nice.

DigitalSmile avatar Jun 14 '25 13:06 DigitalSmile

Looks like the merge is complete and the ffm updates are ready for HW testing

taartspi avatar Sep 23 '25 18:09 taartspi

So I'll update develop to 4.0.0-SNAPSHOT and then merge this branch.

Any objections? @FDelporte @DigitalSmile @taartspi

eitch avatar Sep 24 '25 07:09 eitch

@eitch the branch is already on <version>4.0.0-SNAPSHOT</version>. So yes, please, let's merge and speed up testing 🚀 There is only one direction: FORWARD 😉

FDelporte avatar Sep 24 '25 07:09 FDelporte

Done.

eitch avatar Sep 24 '25 07:09 eitch