Device crashes with addRepeatingEvent, trying to run over-voltage/over-current/over-temp script
I want to set up some protection logic for over-current, over-voltage and over-temperature situations after seeing that my Tuya plug runs really hot when the dishwasher drying cycle is active. The internal temperature reaches over 90° while it normally sits around 50°, and that can't be healthy for the capacitors. It made me realize that there is probably no protection against thermal runaway events, so it would just destroy itself and in the worst case potentially cause a fire.
So I tried to make both soft and hard limits for voltage, current and temperature. A hard limit turns off the relay immediately; a soft limit increases a counter on every tick it is exceeded (and decreases it otherwise) and turns off the relay once the counter reaches 30.
I managed to write the following script:
// ############# Over voltage/current/temp protection ##################
// Using:
// - Channel 1 for relay
// - Channel 20 for soft voltage counter
// - Channel 21 for soft current counter
// - Channel 22 for soft temperature counter
// - Hard voltage cutoff 250
// - Hard current cutoff 20
// - Hard temperature cutoff 95
// - Soft voltage cutoff 245 for 30 ticks
// - Soft current cutoff 16 for 30 ticks
// - Soft temperature cutoff 90 for 30 ticks
// What to do when having over voltage, over current or over temp
alias over_voltage backlog SetChannel 1 0; echo Over voltage ($voltage) detected, turning off relay
alias over_current backlog SetChannel 1 0; echo Over current ($current) detected, turning off relay
alias over_temp backlog SetChannel 1 0; echo Over temperature ($intTemp) detected, turning off relay
// ---- Hard checks ----
// Voltage
alias check_hard_over_voltage if $CH1==1&&$voltage>250 then over_voltage
// Current
alias check_hard_over_current if $CH1==1&&$current>20 then over_current
// Temperature
alias check_hard_over_temp if $CH1==1&&$intTemp>95 then over_temp
// ---- Soft checks ----
// Voltage
SetChannelLabel 20 "Soft voltage threshold counter"
alias over_soft_voltage_counter_reset setChannel 20 0
alias over_soft_voltage_counter_add backlog setChannel 20 $CH20+1
alias over_soft_voltage_counter_rem_min setChannel 20 $CH20-1
alias over_soft_voltage_counter_rem if $CH20>0 then over_soft_voltage_counter_rem_min
alias check_soft_over_voltage_counter if $CH1==1&&$voltage>245 then over_soft_voltage_counter_add else over_soft_voltage_counter_rem
alias check_soft_over_voltage backlog check_soft_over_voltage_counter; if $CH1==1&&$CH20>=30 then over_voltage
over_soft_voltage_counter_reset
// Current
SetChannelLabel 21 "Soft current threshold counter"
alias over_soft_current_counter_reset setChannel 21 0
alias over_soft_current_counter_add backlog setChannel 21 $CH21+1
alias over_soft_current_counter_rem_min setChannel 21 $CH21-1
alias over_soft_current_counter_rem if $CH21>0 then over_soft_current_counter_rem_min
alias check_soft_over_current_counter if $CH1==1&&$current>16 then over_soft_current_counter_add else over_soft_current_counter_rem
alias check_soft_over_current backlog check_soft_over_current_counter; if $CH1==1&&$CH21>=30 then over_current
over_soft_current_counter_reset
// Temperature
SetChannelLabel 22 "Soft temperature threshold counter"
alias over_soft_temp_counter_reset setChannel 22 0
alias over_soft_temp_counter_add backlog setChannel 22 $CH22+1
alias over_soft_temp_counter_rem_min setChannel 22 $CH22-1
alias over_soft_temp_counter_rem if $CH22>0 then over_soft_temp_counter_rem_min
alias check_soft_over_temp_counter if $CH1==1&&$intTemp>90 then over_soft_temp_counter_add else over_soft_temp_counter_rem
alias check_soft_over_temp backlog check_soft_over_temp_counter; if $CH1==1&&$CH22>=30 then over_temp
over_soft_temp_counter_reset
// --- Run checks ---
// define alias to do all the checks
alias hard_protection_check backlog check_hard_over_voltage; check_hard_over_current; check_hard_over_temp
alias soft_protection_check backlog check_soft_over_voltage; check_soft_over_current; check_soft_over_temp
This seems to work fine when I test it (with lowered values) and run the soft_protection_check command manually. However, when I run addRepeatingEvent 1 -1 soft_protection_check to check it every second, the device crashes and reboots. I've tried increasing the interval to 5 seconds and it still crashes. I'm not really sure what the cause is, especially since it works as it should when I run the command manually. Is there a way to debug this? The logs don't really give me a clue. The hard_protection_check works fine in an addRepeatingEvent, so it's specifically related to the more complex soft logic.
I'm not super familiar with writing a script, I've used the reference on the github as a guide so if you see something in the script that can be optimized or be done better please let me know as well.
(I tried to post this on the Elektroda forum as a topic but for some reason I kept getting 403 when it tried to generate a subject and then failed to post)
EDIT: Fixed an error in the soft check, I defined an alias twice
This is probably related to https://github.com/openshwprojects/OpenBK7231T_App/issues/1663. As a suggestion, you can flash the OpenBK7231x_ALT firmware (it is built with the new SDK, and the timer stack size was increased). Be aware, though, that the new SDK has a bug that causes spikes in voltage/current data when logging is enabled. If the ALT firmware fixes the problem, set loglevel to 0 in the startup command.
Is the stack size chip dependent? I have a couple with ECR6600 and a few with LN882H as mcu.
Where would I find the ALT version? I'll give it a shot to see if that solves it.
No, it is on BK7231T and BK7231N only. I don't know if LN or ECR are affected though. ALT is in the release assets, like https://github.com/openshwprojects/OpenBK7231T_App/releases/download/1.18.135/OpenBK7231N_ALT_1.18.135.rbl
Thanks, for reference the issue I have is on a LN882H. I haven't tested it on a ECR6600 yet.
Will see if I can get the alt running on it.
The hard limits do work so at least some protection is in place, so it's not that big of a deal but it would be nice to have a tolerant zone for spikes so it isn't too aggressively turning off the relay
LN882H timer stack size is only 512, 6 times less than on BK7231. SDK patch is needed here: https://github.com/openshwprojects/OpenLN882H/blob/a5937fca5948c4e661b95ae3343f5e455cdca8e9/components/kernel/FreeRTOS/Source/include/FreeRTOSConfig.h#L147
There is no ALT for LN882H, only for Beken. ECR might not be affected, its timer stack size is 4096.
Ah that's unfortunate. Can I increase the stack size and recompile or would I hit hardware limitations?
Alternatively I could write a driver that does the monitoring, I can't imagine the stack would be very big for it when written in C.
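As a rough illustration of how small the per-tick logic could be in C, here is a minimal sketch of the same hard/soft limit scheme as the script. The names `limit_t` and `limit_update` are hypothetical and not part of OpenBeken; hooking this into the firmware's driver and channel APIs is left out entirely.

```c
#include <stdbool.h>

/* Hypothetical helper type, not an OpenBeken API.
 * Thresholds are meant to mirror the script, e.g. 95 / 90 / 30 for temperature. */
typedef struct {
    float hard_limit;  /* trip immediately above this value */
    float soft_limit;  /* count ticks spent above this value */
    int   soft_ticks;  /* trip once the counter reaches this many ticks */
    int   counter;     /* current soft counter, starts at 0 */
} limit_t;

/* Call once per tick while the relay is on.
 * Returns true when the relay should be switched off. */
bool limit_update(limit_t *l, float value)
{
    if (value > l->hard_limit) {
        return true;               /* hard limit: no tolerance */
    }
    if (value > l->soft_limit) {
        l->counter++;              /* over the soft limit: count up */
    } else if (l->counter > 0) {
        l->counter--;              /* back in range: decay towards 0 */
    }
    return l->counter >= l->soft_ticks;
}
```

One `limit_t` per measurement (voltage, current, temperature) would replace the three counter channels, and the only state kept between ticks is a single integer each.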
I already had everything forked, so here you go https://github.com/NonPIayerCharacter/OpenBK7231T_App/actions/runs/16324453684
Hey, thank you for testing, if you confirm that it works good with larger stack size, please open PR so I can add this change to the main tree
I've tested the artifact and it does schedule it correctly without an issue. It triggers the relay off just like I programmed it to so the code runs fine.
However, the HTTP webserver breaks after about 30 seconds; I just get an ERR_CONNECTION_RESET in the browser. Pinging works fine, but the webserver becomes completely unresponsive. This happens regardless of whether I schedule the check or not, so I'm not sure if that's related to the specific build or a consequence of the increased stack size.
I was able to revert to the prior version Built on Jul 1 2025 06:58:01 version 1.18.130 via OTA within that 30 sec window and now the HTTP webserver is stable again.
Is there a way to store the latest logs to littlefs periodically? Without HTTP access it's kind of hard to see if there are any errors and I don't have the device hooked up to UART atm.
I've just tested the script on 1.18.130 on a ECR6600 and there it runs addRepeatingEvent 1 -1 soft_protection_check but crashes the moment it reaches over_temp, so I'm fairly sure it's related to the stack overflowing, but a bit later because the ECR6600 has more stack size.
As it completed on the aforementioned build with
#define configTIMER_TASK_STACK_DEPTH 6144
That is definitely enough stack size to run the script without issue.
Just need to figure out if it has any side effects, I still don't really know why the webserver is so unstable on the build you made. I have 2 other smart plugs I still need to flash via uart, so I'll test on one of them, put the logtarget to uart1 and read the logs that way and hopefully see what is going on.
Error:CMD:cmd uartInit NOT found (args 115200)
Error:CMD:cmd logport NOT found (args 1)
Unknown command
Ah dang it, any other way I can read the log? The UART2 pins are not broken out to pads and I can't see myself soldering to the chip's tiny pins without destroying it.
Update: the http server seems stable with a completely fresh device, so I'll incrementally start to set it up and see what breaks it.
Update 2: okay weird, I can not reproduce my issues from yesterday, I've set up the device exactly the same and the HTTP webserver just works. It sometimes drops the connection and I need to do the commands multiple times or save autoexec.bat multiple times before it actually takes, but I have that with my other plugs as well. I don't at all see the behaviour that I saw yesterday.
Update 3: I do get
Info:BERRY:be_pcall fail, retcode 3
Info:BERRY:top=3
Info:BERRY:stack traceback:
Info:BERRY:
Info:BERRY:string
Info:BERRY::1:
Info:BERRY: in function `
Info:BERRY:main
Info:BERRY:`
Info:BERRY:stack[1] = type='function' ()
Info:BERRY:stack[2] = type='string' (import_error)
Info:BERRY:stack[3] = type='string' (module 'autoexec' not found)
Info:BERRY:[berry end]
At boot, but it doesn't seem to affect things much.
For reference this is my full autoexec.bat and things seem to be working fine:
// start power monitoring driver
startDriver BL0937
// ############ Date & time handling #################
// start NTP driver for energy stats
startDriver NTP
SetupEnergyStats 1 60 60
// By default NTP uses UTC, that means that energy stats are reset at 01:00 or 02:00 depending on DST
// This calculates whether to use time zone offset 01:00 or 02:00 based on the day and month
// Taken from https://www.elektroda.com/rtvforum/topic4086621-30.html
//CET time zone
alias winter_time ntp_timeZoneOfs 1
alias summer_time ntp_timeZoneOfs 2
winter_time
// summertime
alias check1 if $month>3 then summer_time
alias check2 if $month==3&&31-$mday+$day<=6&&$day>0 then summer_time
alias check3 if $month==3&&31-$mday+$day<=6&&$day==0&&$hour>=2 then summer_time
// wintertime
alias check4 if $month>10 then winter_time
alias check5 if $month==10&&31-$mday+$day<=6&&$day>0 then winter_time
alias check6 if $month==10&&31-$mday+$day<=6&&$day==0&&$hour>=3 then winter_time
alias set_time backlog check1;check2;check3;check4;check5;check6
waitfor NTPState 1
set_time
addClockEvent 04:00 0xff 1 set_time
// ############# Over voltage/current/temp protection ##################
// Using:
// - Channel 1 for relay
// - Channel 20 for soft voltage counter
// - Channel 21 for soft current counter
// - Channel 22 for soft temperature counter
// - Hard voltage cutoff 250
// - Hard current cutoff 20
// - Hard temperature cutoff 95
// - Soft voltage cutoff 245 for 30 ticks
// - Soft current cutoff 16 for 30 ticks
// - Soft temperature cutoff 90 for 30 ticks
// What to do when having over voltage, over current or over temp
alias over_voltage backlog SetChannel 1 0; echo Over voltage ($voltage) detected, turning off relay
alias over_current backlog SetChannel 1 0; echo Over current ($current) detected, turning off relay
alias over_temp backlog SetChannel 1 0; echo Over temperature ($intTemp) detected, turning off relay
// ---- Hard checks ----
// Voltage
alias check_hard_over_voltage if $CH1==1&&$voltage>250 then over_voltage
// Current
alias check_hard_over_current if $CH1==1&&$current>20 then over_current
// Temperature
alias check_hard_over_temp if $CH1==1&&$intTemp>95 then over_temp
// ---- Soft checks ----
// Voltage
SetChannelLabel 20 "Soft voltage threshold counter"
alias over_soft_voltage_counter_reset setChannel 20 0
alias over_soft_voltage_counter_add backlog setChannel 20 $CH20+1
alias over_soft_voltage_counter_rem_min setChannel 20 $CH20-1
alias over_soft_voltage_counter_rem if $CH20>0 then over_soft_voltage_counter_rem_min
alias check_soft_over_voltage_counter if $CH1==1&&$voltage>245 then over_soft_voltage_counter_add else over_soft_voltage_counter_rem
alias check_soft_over_voltage backlog check_soft_over_voltage_counter; if $CH1==1&&$CH20>=30 then over_voltage
over_soft_voltage_counter_reset
// Current
SetChannelLabel 21 "Soft current threshold counter"
alias over_soft_current_counter_reset setChannel 21 0
alias over_soft_current_counter_add backlog setChannel 21 $CH21+1
alias over_soft_current_counter_rem_min setChannel 21 $CH21-1
alias over_soft_current_counter_rem if $CH21>0 then over_soft_current_counter_rem_min
alias check_soft_over_current_counter if $CH1==1&&$current>16 then over_soft_current_counter_add else over_soft_current_counter_rem
alias check_soft_over_current backlog check_soft_over_current_counter; if $CH1==1&&$CH21>=30 then over_current
over_soft_current_counter_reset
// Temperature
SetChannelLabel 22 "Soft temperature threshold counter"
alias over_soft_temp_counter_reset setChannel 22 0
alias over_soft_temp_counter_add backlog setChannel 22 $CH22+1
alias over_soft_temp_counter_rem_min setChannel 22 $CH22-1
alias over_soft_temp_counter_rem if $CH22>0 then over_soft_temp_counter_rem_min
alias check_soft_over_temp_counter if $CH1==1&&$intTemp>90 then over_soft_temp_counter_add else over_soft_temp_counter_rem
alias check_soft_over_temp backlog check_soft_over_temp_counter; if $CH1==1&&$CH22>=30 then over_temp
over_soft_temp_counter_reset
// --- Run checks ---
// define alias to do all the checks
alias hard_protection_check backlog check_hard_over_voltage; check_hard_over_current; check_hard_over_temp
alias soft_protection_check backlog check_soft_over_voltage; check_soft_over_current; check_soft_over_temp
addRepeatingEvent 1 -1 backlog hard_protection_check; soft_protection_check
// ############ Home assistant ################
// Make sure Flags 2 and Flags 10 are checked. Do not set Flag 21 (https://community.home-assistant.io/t/mqtt-sensor-state-only-updating-when-reloading-config/643521/11) or HA will set the state correctly after triggering
// schedule HA discovery to update state when connected to MQTT
scheduleHADiscovery 30
// Publish state and such every 60 seconds
mqtt_broadcastInterval 60
// Schedule a HA discovery every 5 minutes. When HA or MQTT restarts the config is gone unless it is republished
addRepeatingEvent 300 -1 scheduleHADiscovery 5
// enable power saving mode
Powersave 1
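For anyone trying to follow the six time-zone checks in the autoexec.bat above, this is a small C sketch of the same rule (switch on the last Sunday of March and October). The function name `cet_offset_hours` is hypothetical, `wday` follows the script's `$day` convention as I understand it (0 = Sunday), and the expression `31 - mday + wday <= 6` only works because March and October both have 31 days.

```c
/* Hypothetical helper mirroring check1..check6 from the script.
 * Returns the offset to pass to ntp_timeZoneOfs: 1 (winter) or 2 (summer).
 * wday: 0 = Sunday ... 6 = Saturday (assumed to match $day). */
int cet_offset_hours(int month, int mday, int wday, int hour)
{
    /* True when no further Sunday is left in a 31-day month after today,
     * i.e. today is on or after the last Sunday of the month. */
    int on_or_after_last_sunday = (31 - mday + wday) <= 6;
    int offset = 1; /* winter_time is the default */

    if (month > 3) offset = 2;                                                       /* check1 */
    if (month == 3 && on_or_after_last_sunday && wday > 0) offset = 2;               /* check2 */
    if (month == 3 && on_or_after_last_sunday && wday == 0 && hour >= 2) offset = 2; /* check3 */
    if (month > 10) offset = 1;                                                      /* check4 */
    if (month == 10 && on_or_after_last_sunday && wday > 0) offset = 1;              /* check5 */
    if (month == 10 && on_or_after_last_sunday && wday == 0 && hour >= 3) offset = 1;/* check6 */
    return offset;
}
```

For example, 30 March 2025 was the last Sunday of March, so the offset is still 1 at 01:00 and becomes 2 from 02:00 onwards.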
Well I just got the same again when powering from mains voltage. It worked fine when powering from my TTL to UART, but on mains voltage it seems to interfere somehow. I don't get request timeouts from pings
This seems to line up with someone else having the same issue: https://www.elektroda.com/rtvforum/viewtopic.php?p=21602868#21602868 . I don't think it's related to the stack size, but something changed after v1.18.130 that makes the wifi unstable when powered from mains voltage (MQTT also stops responding), until it eventually reboots by itself and then it's available again for a bit.
logport is Beken only. The build at https://github.com/NonPIayerCharacter/OpenBK7231T_App/actions/runs/16342185644 moves the LN882H OBK tick to a separate task, like how it's done on most platforms. It also doubles this task's stack size, which should affect ECR too.
I'll test it out thanks.
I think the webserver and MQTT no longer responding is caused by HA discovery. It might be coincidence but I was monitoring the logs and not long after Info:MAIN:Will do request HA discovery now. it stopped responding.
I have
// Schedule a HA discovery every 5 minutes. When HA or MQTT restarts the config is gone unless it is republished
addRepeatingEvent 300 -1 scheduleHADiscovery 5
and that seems to correlate to when it stops responding but I don't know why it ran fine for 15min+ on 3.3v directly hmmm
I don't know how it deals with multiple events and stack size/memory
I've updated and running
Built on Jul 17 2025 10:07:14 version _lntimer_4bfcbf5ee38e Online for 11 minutes and 44 seconds
now and so far it seems to work alright. Sometimes the webserver becomes unresponsive but it seems to be able to recover after a bit. I'll monitor it for a while
I'm running the latest build on 4 different plugs (all LN882H), with the checks scheduled every second and everything seems to work fine. No spurious reboots or anything anymore so I think the issue I had is solved.
I've tested the same build on the ECR6600 plugs and it also works without an issue now. Thanks a lot!
Should be safe to make a pull request for it I think
@openshwprojects Integrated this commit into this pull: https://github.com/openshwprojects/OpenBK7231T_App/pull/1713 It was basically a fix for ESP8285 and ESP BL0937
merged, thank you
@drake7707 do you think it might be worth to create a C driver for the feature you mentioned? It seems pretty generic
I think so. It would allow for better integration and much higher sampling frequency. I don't know how or if Tuya socket firmwares have any kind of protection built in but this feels like such an important safety feature that all smart sockets should come with by default just to have an extra layer of insurance.
Right now my script only echoes that something was amiss, but ideally I want to set a separate channel (type error?) that shows the error on the home page and also emits it to MQTT so I can use Home Assistant to send me a notification. "Dishwasher was shut down because the socket overheated" sounds like an urgent enough notification to go inspect it asap, especially if it normally doesn't do that.
I haven't guarded against whether current is flowing when the relay is off. That could mean faulty readings but also that the relay is fused shut and that should at least warrant a warning as well. There's not much to do other than give a warning/error if the relay is kaput.
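To make the last two points a bit more concrete, here is a minimal C sketch of a fault code plus a "current while relay is off" check. Everything here is hypothetical (the `fault_t` enum, `check_relay_stuck`, the noise floor and tick count); nothing is taken from the OpenBeken sources, and publishing the fault to a channel or MQTT is not shown.

```c
#include <stdbool.h>

/* Hypothetical fault codes that a dedicated error channel could carry. */
typedef enum {
    FAULT_NONE = 0,
    FAULT_OVER_VOLTAGE,
    FAULT_OVER_CURRENT,
    FAULT_OVER_TEMP,
    FAULT_RELAY_STUCK
} fault_t;

/* Call once per tick. Reports FAULT_RELAY_STUCK once current above the
 * noise floor has been measured for required_ticks consecutive ticks while
 * the relay is commanded off. *ticks is caller-owned state, initially 0. */
fault_t check_relay_stuck(bool relay_on, float current_a, float noise_floor_a,
                          int *ticks, int required_ticks)
{
    if (!relay_on && current_a > noise_floor_a) {
        if (*ticks < required_ticks) {
            (*ticks)++;            /* suspicious reading: count it */
        }
    } else {
        *ticks = 0;                /* relay on or no current: reset */
    }
    return (*ticks >= required_ticks) ? FAULT_RELAY_STUCK : FAULT_NONE;
}
```

Requiring several consecutive ticks is just one way to avoid flagging a single faulty reading right after the relay switches off; the exact noise floor would have to be picked per metering chip.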