Arduino icon indicating copy to clipboard operation
Arduino copied to clipboard

Expose attachInterruptArg in Arduino.h and updated base for functional interrupts

Open dok-net opened this issue 5 years ago • 29 comments

A C++ std library functional/bind/lambdas based quite clean implementation [of the same functionality as in #6055 (closed) ]

Edit: Since May 4 2019, original PR date, a lot has changed, given the time for testing, input from project members, and general refactoring. So here's what this is today:

The interrupt handling code gets updated to use C++ std::function. C-style bindings are preserved as wrappers. This solves issues with leaking of attached arguments on detachInterrupt or re-attaching; is type-safe; makes std::function a first-class citizen for interrupt service routines; has been tested for code paths all being in IRAM, if std::bind is used, but not lambdas, which currently exhibit a compiler weakness in this area.

dok-net avatar May 04 '19 09:05 dok-net

@d-a-v : could #6039 please take into account all the work and research I've done on fixing the interrupt handling? Please remember, that at heart, #6038 is the cleanest ISR vs. ICACHE_RAM safe implementation at this present stage I have to offer, the caveat being that the user cannot mix functional and non-functional attaches and detaches without care. #6003 provides this extra ease, at the expense of copied struct definition, dependency inversion between FunctionalInterrupt and core_esp8266_wiring_digital and the extra functional field only for the purpose of supporting FunctionalInterrupt. My take is still to go for clean dependencies over ease of use :-)

dok-net avatar May 04 '19 09:05 dok-net

@dok-net Thanks for your work, Lots of nice improvement-PRs including yours are waiting for the bugfix-one-month-late core-2.5.1 release so they can be properly tested / reviewed. 2.5.1 is imminent now (well, after some neverending recent small but mandatory fixing merged commits).

d-a-v avatar May 04 '19 10:05 d-a-v

@dok-net #6039 is orthogonal to your PRs #6003 #6038 #6047. What do you mean with " take into account all the work and research" ? Am I stepping on your toes ? (In the files you modified I fixed a static declared in a .h, not a real issue. On the other hand I added an IRAM_ATTR define which might please your esp8266/esp32 lib :)

d-a-v avatar May 04 '19 12:05 d-a-v

Dear @d-a-v , no, stepping on my toes is much to hard a word, I am just a little panicky that #6039 might cause a merge mess for me, I only wanted to point that out. As far a bug fix vs. feature is concerned, I've created issue report #6049 to hopefully motivate sufficiently why this PR is a bug fix and should be in the very next bugfix release. SoftwareSerial is also pending release reliant on the availability of attachInterruptArg - of course I could back-port, but I'd rather have it all matching up nicely as it does in my own repos' configurations right now. Naturally, I can only try to show that this PR is a relatively minor fix to a real problem without expected side-effects...

dok-net avatar May 04 '19 13:05 dok-net

@dok-net There are #6048+#6055 and this one #6047. Does OP states #6047 replace both ?

Also I understand #5922 should allow functional in IRAM and there are issues about that per @hreintke 's comment https://github.com/esp8266/Arduino/pull/6055#issuecomment-491584558 and it needs to be sorted out.

d-a-v avatar May 15 '19 00:05 d-a-v

@d-a-v

There are #6048+#6055 and this one #6047. Does OP states #6047 replace both ?

Yes, it should work this way (in git).

Also I understand #5922 should allow functional in IRAM and there are issues about that per @hreintke 's comment #6055 (comment) and it needs to be sorted out.

I couldn't quite wrap my head around the irom.text etc. meddling, and unfortunately the above mentioned comment features incomplete code and I can't figure it quite out either - I would like to trust that the pertaining parts of std::function<void()> are either inlined or linker-edited into IRAM as per #5922, and that is sufficient for #6047, unless someone attaches/detaches ISRs from inside executing ISRs, on which opinions vary widely :-)

dok-net avatar May 15 '19 05:05 dok-net

@dok-net I just updated the comment in #6055 to include a complete sketch. If anything still unclear, just let me know.

hreintke avatar May 15 '19 10:05 hreintke

@d-a-v @hreintke To keep you informed, with this std::function based ISR attach code, I see sporadic crashes after reset. So yes, there is still something not in order with the IRQ-safety of std::function and friends.

dok-net avatar Jun 02 '19 16:06 dok-net

@dok-net @d-a-v : I can do further checking on this but am lost in the PR's and associated commits. Do I need only git master + this PR or is it also dependent on others like the Ticker etc ?

hreintke avatar Jun 03 '19 07:06 hreintke

@hreintke I almost can't resist giving an answer that's just as cryptic :-) No, don't worry, I am glad you care! I am putting a lot of work into keeping the PRs clear to merge with master each by themselves. If this doesn't work now and then, yes, me too, I am dissatisfied and disbelieving how long git and the GitHub PRs exist without proper dependency management for the inter-PR dependencies. Stuffing all into a single branch and thus one PR can't quite be the solution, either. Anyway, Ticker has nothing to do with #6049, hence this PR stands alone. For differential diagnosis, you can try #6055 (this includes #6048), which are what I use while this #6047 unfortunately doesn't work.

dok-net avatar Jun 03 '19 09:06 dok-net

@dok-net @d-a-v @igrr Is there a way that, for debug purposes, we can intercept the "missed cache interrupt" and then check whether we are in interrupt code. If no -> continue as usual, if yes -> panic() and output somehow the offending function address ?

hreintke avatar Jun 03 '19 10:06 hreintke

@hreintke, There's no cache miss handler code. Cache fetches are done completely in HW. If the HW state machine can't engage the SPI bus to do the read you get a nonrecoverable HW exception at that point and crash hard. If it can access the bus, it will work since interrupts are just normal codepaths and nothing fancy on these machines.

As a debug only method, you might be able to do something fishy to the SPI bus at the start of our interrupt dispatch code and then undo it on return. Then panics will occur should ISR code attempt to go out to flash. But that's debug only and it will break SPI, WiFi persistence, and anything else that access flash as a data store.

I think it's probably overkill, though. Even the ISR-in-RAM check is a bit pedantic since it is possible for an app to never access flash as data store, or an app could disable IRQs before doing so (and killing any IRQ generated stuff like PWM, etc.).

earlephilhower avatar Jun 03 '19 16:06 earlephilhower

@d-a-v It seems you've some time at your hands today to talk - I've a question: What is the state of the IRAM linker segment manipulations w.r.t. std::function now? I have hacked a fix for EspSoftwareSerial (master HEAD) to use that in the meantime until my bigger PRs get reviewed and merged, and somehow it's working using the FunctionalInterrupts API instead of the C-style attachInterruptArg (in my PRs). Before, it seemed unstable. Also, a standalone test in libraries/FunctionalInterrupt/ that I had seen failing, as I believed due to flash accessing during interrupt service - runs fine. As much as I understood it, std::function<void(void)> are OK, does this depend on the captures in any way?

dok-net avatar Jun 21 '19 14:06 dok-net

As much as I understood it, std::function<void(void)> are OK, does this depend on the captures in any way?

Since #5922, all functional effective code is moved into IRAM. Matching signature is supposed to be: std::function<any(...)>::operator()() const. In this comment I didn't check the capture influence to the generated object name. You can play with https://demangler.com/ and check if your symbol matches _ZNKSt8functionIF*EE*

You can also more easily check from a running sketch whether a lambda is linked IRAM with this trick.

d-a-v avatar Jun 21 '19 15:06 d-a-v

@d-a-v I believe, with the information you provided me, that this PR now contains a safe refactoring of the IRQ code that has the relevant code paths in IRAM. It certainly is a lot cleaner than #6055 .

dok-net avatar Jun 25 '19 07:06 dok-net

A question about your use of std::move(), when using it, argument is supposed to be passed as rvalue reference type&& arg no ? per this "course" (which I find worth being read and re-read :)

d-a-v avatar Jun 25 '19 11:06 d-a-v

@d-a-v No, I don't think so, that's an oversimplification. https://en.cppreference.com/w/cpp/utility/move explains this better. What I am trying to express/achieve is that the move constructors/assignments are selected if available.

dok-net avatar Jun 25 '19 11:06 dok-net

Then to be consistent, we'd need to have both

void attachScheduledInterrupt(uint8_t pin, 
                              std::function<void(InterruptInfo)> scheduledIntRoutine,
                              int mode)

and

void attachScheduledInterrupt(uint8_t pin, 
                              std::function<void(InterruptInfo)>&& scheduledIntRoutine,
                              int mode)

(both using std::move()) (like they do in STL for example) (no asking to this here, because if we go that way we'd need to update the whole API)

d-a-v avatar Jun 25 '19 12:06 d-a-v

@d-a-v it's certainly desirable avoiding copying objects all the time for no good reason - but yes, that's a lot of extra work doing it just for consistency. Are you OK with me using it when doing new things step by step, tho?

dok-net avatar Jun 25 '19 12:06 dok-net

Are you OK with me using it when doing new things step by step, tho?

Of course yes. Just not because we are on a small MCU should we avoid any cheap kind of optimization :)

d-a-v avatar Jun 25 '19 12:06 d-a-v

@d-a-v I was asked by @devyte to test and improve the performance of this patch, such that it is no worse for void(*)(void) ISRs over before. I am now seing a major improvement in performance in my test, but I could have implemented a horrible bug :-) Could you please read core_esp8266_wiring_digital.cpp over again, with a keen eye out on interrupt_handler(void*)? Thanks!

dok-net avatar Oct 31 '19 14:10 dok-net

Some test results. But first, I noticed that in master, set_interrupt_handlers() is missing the ICACHE_RAM_ATTR attribution - which when done right adds a few bytes to IRAM use.

Master including this fix has following memory use for small test case using void(*)(void) ISRs:

Executable segment sizes
IROM   *: 231648          - code in flash         (default or ICACHE_FLASH_ATTR)
IRAM   *: 28000   \ 32768 - code in IRAM          (ICACHE_RAM_ATTR, ISRs...)
DATA   *: 1272  )         - initialized variables (global, static) in RAM\HEAP
RODATA *: 744   ) \ 81920 - constants             (global, static) in RAM\HEAP
BSS    *: 25224 )         - zeroed variables      (global, static) in RAM\HEAP
Program size: 261,664 bytes (used 25% of a 1,044,464 byte maximum) (27.89 secs)
Minimum Memory Usage: 27240 bytes (33% of a 81920 byte maximum)

It performs two cascaded IRQs with an avg. latency/CPU_cycles = 961.

Just switching core to this branch, the results are:

Executable segment sizes
IROM   *: 231664          - code in flash         (default or ICACHE_FLASH_ATTR)
IRAM   *: 27960   \ 32768 - code in IRAM          (ICACHE_RAM_ATTR, ISRs...)
DATA   *: 1272  )         - initialized variables (global, static) in RAM\HEAP
RODATA *: 752   ) \ 81920 - constants             (global, static) in RAM\HEAP
BSS    *: 25296 )         - zeroed variables      (global, static) in RAM\HEAP
Program size: 261,648 bytes (used 25% of a 1,044,464 byte maximum) (27.34 secs)
Minimum Memory Usage: 27320 bytes (33% of a 81920 byte maximum)

Performance is: avg. Latency/CPU_cycles = 891.

Next, changing the test case from plain function pointer to std::function for a void(int) curried with an integer value, this is result:

Executable segment sizes
IROM   *: 231952          - code in flash         (default or ICACHE_FLASH_ATTR)
IRAM   *: 27840   \ 32768 - code in IRAM          (ICACHE_RAM_ATTR, ISRs...)
DATA   *: 1272  )         - initialized variables (global, static) in RAM\HEAP
RODATA *: 732   ) \ 81920 - constants             (global, static) in RAM\HEAP 
BSS    *: 25296 )         - zeroed variables      (global, static) in RAM\HEAP
Program size: 261,796 bytes (used 25% of a 1,044,464 byte maximum) (1.85 secs)
Minimum Memory Usage: 27300 bytes (33% of a 81920 byte maximum)

Performing at avg. Latency/CPU_cycles = 929.

The sketch:

#include <interrupts.h>

uint32_t start;
uint8_t toggle_d8 = HIGH;
uint8_t toggle_d6 = HIGH;

void IRAM_ATTR irq_d7(int x)
{
	toggle_d8 ^= HIGH;
	digitalWrite(D8, toggle_d8);
}

uint32_t sum_delay = 0;
constexpr uint32_t MAXREPEATS = 1000;
uint32_t repeats = 0;
void IRAM_ATTR irq_d5(int x)
{
	auto cpuCycle = ESP.getCycleCount();
	sum_delay += cpuCycle - start;
	++repeats;
	start = cpuCycle;
	if (repeats < MAXREPEATS)
	{
		toggle_d6 ^= HIGH;
		digitalWrite(D6, toggle_d6);
	}
}

void setup()
{
	Serial.begin(9600);
	pinMode(D8, OUTPUT);
	digitalWrite(D8, toggle_d8);
	pinMode(D7, INPUT_PULLUP);
	pinMode(D6, OUTPUT);
	digitalWrite(D6, toggle_d6);
	pinMode(D5, INPUT_PULLUP);
	attachInterrupt(D7, std::bind(irq_d7, 42), CHANGE);
	attachInterrupt(D5, std::bind(irq_d5, 43), CHANGE);
	start = ESP.getCycleCount();
	toggle_d6 ^= HIGH;
	digitalWrite(D6, toggle_d6);
}

void loop()
{
	if (repeats >= MAXREPEATS)
	{
		{
			esp8266::InterruptLock lock;
			Serial.print("avg. Latency/CPU_cycles = ");
			Serial.println(sum_delay / repeats);
			sum_delay = 0;
			repeats = 0;
			start = ESP.getCycleCount();
			toggle_d6 ^= HIGH;
		}
		digitalWrite(D6, toggle_d6);
	}
}

dok-net avatar Oct 31 '19 20:10 dok-net

Latest master compared to rebased PR branch. Master:

Executable segment sizes
IROM   *: 234064          - code in flash         (default or ICACHE_FLASH_ATTR)
IRAM   *: 27580   \ 32768 - code in IRAM          (ICACHE_RAM_ATTR, ISRs...)
DATA   *: 1244  )         - initialized variables (global, static) in RAM\HEAP
RODATA *: 1600  ) \ 81920 - constants             (global, static) in RAM\HEAP
BSS    *: 25408 )         - zeroed variables      (global, static) in RAM\HEAP
Program size: 264,488 bytes (used 25% of a 1,044,464 byte maximum) (2.74 secs)
Minimum Memory Usage: 28252 bytes (34% of a 81920 byte maximum)

PR:

Executable segment sizes
IROM   *: 234080          - code in flash         (default or ICACHE_FLASH_ATTR)
IRAM   *: 27544   \ 32768 - code in IRAM          (ICACHE_RAM_ATTR, ISRs...)
DATA   *: 1244  )         - initialized variables (global, static) in RAM\HEAP
RODATA *: 1608  ) \ 81920 - constants             (global, static) in RAM\HEAP
BSS    *: 25480 )         - zeroed variables      (global, static) in RAM\HEAP
Program size: 264,476 bytes (used 25% of a 1,044,464 byte maximum) (29.70 secs)
Minimum Memory Usage: 28332 bytes (35% of a 81920 byte maximum)

Running:

avg. Latency/us = 11.14

Sketch that compiles both in master and PR:

#include <interrupts.h>

uint32_t start;
uint8_t toggle_d8 = HIGH;
uint8_t toggle_d6 = HIGH;

void IRAM_ATTR irq_d7(void* x)
{
	toggle_d8 ^= HIGH;
	digitalWrite(D8, toggle_d8);
}

uint32_t sum_delay = 0;
constexpr uint32_t MAXREPEATS = 1000;
uint32_t repeats = 0;
void IRAM_ATTR irq_d5(void* x)
{
	auto cpuCycle = ESP.getCycleCount();
	sum_delay += cpuCycle - start;
	++repeats;
	start = cpuCycle;
	if (repeats < MAXREPEATS)
	{
		toggle_d6 ^= HIGH;
		digitalWrite(D6, toggle_d6);
	}
}

void setup()
{
	Serial.begin(9600);
	pinMode(D8, OUTPUT);
	digitalWrite(D8, toggle_d8);
	pinMode(D7, INPUT_PULLUP);
	pinMode(D6, OUTPUT);
	digitalWrite(D6, toggle_d6);
	pinMode(D5, INPUT_PULLUP);
	attachInterruptArg(D7, irq_d7, (void*)42, CHANGE);
	attachInterruptArg(D5, irq_d5, (void*)43, CHANGE);
	start = ESP.getCycleCount();
	toggle_d6 ^= HIGH;
	digitalWrite(D6, toggle_d6);
}

void loop()
{
	if (repeats >= MAXREPEATS)
	{
		{
			esp8266::InterruptLock lock;
			Serial.print("avg. Latency/us = ");
			Serial.println(1.0 * sum_delay / repeats / ESP.getCpuFreqMHz());
			sum_delay = 0;
			repeats = 0;
			start = ESP.getCycleCount();
			toggle_d6 ^= HIGH;
		}
		digitalWrite(D6, toggle_d6);
	}
}

EspSoftwareSerial, off-the-shelf loopback.ino example with constexpr int IUTBITRATE = 128000;:

Loopback example for EspSoftwareSerial
tx/rx: 16000/16000
eff. tx: 54060bps, eff. rx: 54060bps, 1 errors (0.01%)
tx/rx: 16000/15997
eff. tx: 55090bps, eff. rx: 55080bps, 6 errors (0.04%)
tx/rx: 16000/16000
eff. tx: 55130bps, eff. rx: 55130bps, 0 errors (0.00%)
tx/rx: 16000/16000
eff. tx: 55090bps, eff. rx: 55090bps, 0 errors (0.00%)
tx/rx: 16000/16000
eff. tx: 55040bps, eff. rx: 55040bps, 0 errors (0.00%)
tx/rx: 16000/15996
eff. tx: 55010bps, eff. rx: 54990bps, 2 errors (0.01%)
tx/rx: 16000/16000
eff. tx: 54870bps, eff. rx: 54870bps, 0 errors (0.00%)
tx/rx: 16000/16000
eff. tx: 55060bps, eff. rx: 55060bps, 0 errors (0.00%)
tx/rx: 16000/16000
eff. tx: 55040bps, eff. rx: 55040bps, 0 errors (0.00%)
tx/rx: 16000/16000
eff. tx: 54990bps, eff. rx: 54990bps, 0 errors (0.00%)
tx/rx: 16000/16000
eff. tx: 54970bps, eff. rx: 54970bps, 0 errors (0.00%)
tx/rx: 16000/16000
eff. tx: 55040bps, eff. rx: 55040bps, 0 errors (0.00%)
tx/rx: 16000/16000
eff. tx: 55050bps, eff. rx: 55050bps, 2 errors (0.01%)
tx/rx: 16000/16000
eff. tx: 55000bps, eff. rx: 55000bps, 0 errors (0.00%)

dok-net avatar Nov 08 '19 13:11 dok-net

Here's the latest in performance numbers and what the new Delegate class can do for interrupts. See the comments with the measurements below in this working example. Connect D5 to D8, D6 to D7. Pending pushing the updated PR to GitHub:

#include <interrupts.h>

uint32_t start;
uint8_t toggle_d8 = HIGH;
uint8_t toggle_d6 = HIGH;

void IRAM_ATTR irq_d7(void* x)
{
	toggle_d8 ^= HIGH;
	digitalWrite(D8, toggle_d8);
}

void IRAM_ATTR irq_d7_v()
{
	irq_d7(nullptr);
}

uint32_t sum_delay = 0;
constexpr uint32_t MAXREPEATS = 1000;
uint32_t repeats = 0;
void IRAM_ATTR irq_d5(void* x)
{
	auto cpuCycle = ESP.getCycleCount();
	sum_delay += cpuCycle - start;
	++repeats;
	start = cpuCycle;
	if (repeats < MAXREPEATS)
	{
		toggle_d6 ^= HIGH;
		digitalWrite(D6, toggle_d6);
	}
}

void IRAM_ATTR irq_d5_v()
{
	irq_d5(nullptr);
}

void setup()
{
	Serial.begin(9600);
	pinMode(D8, OUTPUT);
	digitalWrite(D8, toggle_d8);
	pinMode(D7, INPUT_PULLUP);
	pinMode(D6, OUTPUT);
	digitalWrite(D6, toggle_d6);
	pinMode(D5, INPUT_PULLUP);
	
	attachInterrupt(D7, irq_d7_v, CHANGE);
	attachInterrupt(D5, irq_d5_v, CHANGE);
	// master / 2.6.2: avg. Latency/us = 12.29
	// PR #6047 + Delegate: avg. Latency/us = 11.42

	//attachInterruptArg(D7, irq_d7, (void*)42, CHANGE);
	//attachInterruptArg(D5, irq_d5, (void*)43, CHANGE);
	// master / 2.6.2: avg. Latency/us = 12.04
	// PR #6047 + Delegate: avg. Latency/us = 11.14

	//attachInterrupt(D7, { irq_d7, (void*)42 }, CHANGE);
	//attachInterrupt(D5, { irq_d5, (void*)43 }, CHANGE);
	// PR #6047 + Delegate: avg. Latency/us = 11.14

	//attachInterrupt(D7, std::bind(irq_d7, (void*)42), CHANGE);
	//attachInterrupt(D5, std::bind(irq_d5, (void*)43), CHANGE);
	// PR #6047 + Delegate: avg. Latency/us = 11.62

	start = ESP.getCycleCount();
	toggle_d6 ^= HIGH;
	digitalWrite(D6, toggle_d6);
}

void loop()
{
	if (repeats >= MAXREPEATS)
	{
		{
			esp8266::InterruptLock lock;
			Serial.print("avg. Latency/us = ");
			Serial.println(1.0 * sum_delay / repeats / ESP.getCpuFreqMHz());
			sum_delay = 0;
			repeats = 0;
			start = ESP.getCycleCount();
			toggle_d6 ^= HIGH;
		}
		digitalWrite(D6, toggle_d6);
	}
}

dok-net avatar Nov 28 '19 17:11 dok-net

Update: I've removed, for the time being, Delegate in favor of plain std::function. Here are the build results, for a FunctionalInterrupt.h, attachScheduledInterrupt example: Master:

ICACHE *: 32768           - flash instruction cache
IROM   *: 237528          - code in flash         (default or ICACHE_FLASH_ATTR)
IRAM   *: 27653   \ 32768 - code in IRAM          (IRAM_ATTR, ISRs...)
DATA   *: 1496  )         - initialized variables (global, static) in RAM\HEAP
RODATA *: 1000  ) \ 81920 - constants             (global, static) in RAM\HEAP
BSS    *: 25888 )         - zeroed variables      (global, static) in RAM\HEAP

This PR:

ICACHE *: 32768           - flash instruction cache
IROM   *: 237720          - code in flash         (default or ICACHE_FLASH_ATTR)
IRAM   *: 27613   \ 32768 - code in IRAM          (IRAM_ATTR, ISRs...)
DATA   *: 1496  )         - initialized variables (global, static) in RAM\HEAP
RODATA *: 992   ) \ 81920 - constants             (global, static) in RAM\HEAP
BSS    *: 25944 )         - zeroed variables      (global, static) in RAM\HEAP

dok-net avatar Apr 03 '21 09:04 dok-net

Further info, performance of back-to-back IRQs, that is, connecting two pins, and let every change on pin A trigger an interrupt on pin B, where the ISR toggles the state of pin A.

Master @ 80MHz: IRQs/s = 136239 This PR @ 80MHz: IRQs/s = 141242

Sketch:

#include <atomic>

constexpr long unsigned MAXIRQS = 200000;

std::atomic<long unsigned> irqCount;
std::atomic<uint32_t> lastTime;
uint32_t prevTime;

IRAM_ATTR void triggered(void*)
{
	if (irqCount.load())
	{
		irqCount.store(irqCount.load() - 1);
		digitalWrite(D4, !digitalRead(D4));
	}
	else
	{
		lastTime.store(ESP.getCycleCount());
	}
}

void setup()
{
	Serial.begin(115200);
	while (!Serial);
	delay(100);
	pinMode(D4, OUTPUT);
	digitalWrite(D4, LOW);
	pinMode(D7, INPUT);
	attachInterruptArg(D7, triggered, nullptr, CHANGE);

	Serial.println("measuring IRQ latency");
	irqCount.store(MAXIRQS);
	prevTime = ESP.getCycleCount();
	digitalWrite(D4, !digitalRead(D4));
}

void loop()
{
	delay(1000);
	if (!irqCount.load())
	{
		Serial.print("IRQs/s = ");
		Serial.println(1000 * MAXIRQS / ((lastTime.load() - prevTime) / ESP.getCpuFreqMHz() / 1000));
		irqCount.store(MAXIRQS);
		prevTime = ESP.getCycleCount();
		digitalWrite(D4, !digitalRead(D4));
	}
}

dok-net avatar Apr 03 '21 11:04 dok-net

For further motivation, here are the results when substituting Delegate back in for std::function: dok-net:all_in_functional_w_delegate@80MHz: IRQs/s = 144092

dok-net avatar Apr 03 '21 11:04 dok-net

For further motivation, here are the results when substituting Delegate back in for std::function: dok-net:all_in_functional_w_delegate@80MHz: IRQs/s = 144092

Notable improvement is removal of if(functional), arg casts, which is mighty good thing for the Core's ISR handler speed.

With the PR though, doesn't std function's function still need to be explicitly marked as IRAM ? What about lambdas? e.g. attachInterrupt(PIN, [some, captured, args]() { ... }, CHANGE); in the setup(), which then looks like std::_Function_handler<void (), setup::{lambda()#1}>::_M_invoke(std::_Any_data const&) in the objdump

I've tried some alternative approach at https://github.com/esp8266/Arduino/compare/master...mcspr:isr-why Looking at objdump (fairly weird) output, std::function<T>:: _M_invoke is part of the std function operator() call tree (and as described here), so the branch experiments by moving it to IRAM via ldscript rule (but I'm pretty sure it needs to be something else for the sake of schedule_function / anything else using void() std funcs)

mcspr avatar May 18 '21 11:05 mcspr

All the compiler limitations with regard to attributes still apply.

dok-net avatar May 18 '21 12:05 dok-net