openhab-addons
openhab-addons copied to clipboard
[rrd4j] Persistence stops saving data suddenly (mapDB not active)
Hi there, For some weeks I have now the problem that the RRD4J persistence stopped saving data suddently. I recognized this because in the temperature graphs the line stopped suddenly. After I restarted openhab everything worked fine for some hours and then it happens always again. Since then I'm troubleshooting this problem and didn't found a solution yet. I tried the forum and discord. https://community.openhab.org/t/all-persistance-stopped-saving-data-suddenly-mapdb-not-active/
The problem got worth from day to day. Now it takes only a few minutes till the persistence is not working anymore.
Expected Behavior
Every minute the value of WT_Arbeitszimmer_IstTemperatur should be save from the rrd4j persistence.
Current Behavior
After a restart it shortly works like expected and then it just stops recording values.
Possible Solution
none found yet.
Steps to Reproduce (for Bugs)
Example:
I restarted openhab at:
2023-10-22 14:20:18.869 [INFO ] [org.openhab.core.Activator ] - Starting openHAB 4.0.3 (build Release Build)
The first value was saved at 14:22 the last value at 14:47. Then it stopped again. (some other values stopped at 14:40 some at 14:48?!)
The rrd4j database file still gets written every minute if I check the filetime:
In total there are 1181 files in the persistence folder.
events.log filtered for this item:
2023-10-22 14:22:13.830 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'WT_Arbeitszimmer_IstTemperatur' changed from NULL to 21.6 °C
2023-10-22 14:34:22.604 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'WT_Arbeitszimmer_IstTemperatur' changed from 21.6 °C to 21.3 °C
2023-10-22 14:37:24.094 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'WT_Arbeitszimmer_IstTemperatur' changed from 21.3 °C to 21 °C
2023-10-22 14:45:02.340 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'WT_Arbeitszimmer_IstTemperatur' changed from 21 °C to 20.6 °C
2023-10-22 14:49:30.676 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'WT_Arbeitszimmer_IstTemperatur' changed from 20.6 °C to 20.5 °C
2023-10-22 15:02:58.618 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'WT_Arbeitszimmer_IstTemperatur' changed from 20.5 °C to 20.9 °C
So the item is changing correctly, the pesistence file is update but nothing is written. Because when I restart openhab the time from 14:47 till the restart stays empty in the graph.
Context
Maybe this helps debbuging the error: The log at the time the peristances stops: (nothing there which helps)
2023-10-22 14:23:27.435 [INFO ] [ab.ui.habpanel.internal.HABPanelTile] - Started HABPanel at /habpanel
2023-10-22 14:37:49.546 [INFO ] [penhab.core.model.script.Anwesenheit] - Veit ist wieder da! Er war 9 Minuten weg.
2023-10-22 14:49:12.161 [INFO ] [automation.script.file.oeffnungen.js] - Durchsage auf: Alexa_Arbeitszimmer_TTS
2023-10-22 14:49:39.052 [INFO ] [automation.script.file.oeffnungen.js] - Durchsage auf: Alexa_Arbeitszimmer_TTS
2023-10-22 14:49:39.056 [INFO ] [automation.script.file.oeffnungen.js] - Durchsage auf: Alexa_Werkstatt_TTS
2023-10-22 15:05:29.488 [INFO ] [automation.script.file.bewegungen.js] - Automatisierung gestartet durch: BW_Westfassade_Nord_Bewegung ON
2023-10-22 15:05:29.490 [INFO ] [automation.script.file.bewegungen.js] - Kamera Garten: Preset 7
Your Environment
I’m running openHAB 4.0.3 on a Synology NAS DS716 with DSM 7.1.1.-42962 in a docker container.
docker container configuration:
in docker:
on server:
the following folder just has a readme:
conf\persistence
conf\services\rrd4j.cfg
:
# configure specific rrd properties for given items in this file.
# please refer to the documentation available at
# https://www.openhab.org/addons/persistence/rrd4j/
#
# default_numeric and default_other are internally defined defnames and are used as
# defaults when no other defname applies
#<dsName>.def=[ABSOLUTE|COUNTER|DERIVE|GAUGE],<heartBeat>,[<minValue>|U],[<maxValue>|U],<sampleInterval>
#<dsName>.archives=[AVERAGE|MIN|MAX|LAST|FIRST|TOTAL],<xff>,<samplesPerBox>,<boxCount>
#<dsName>.items=<comma separated list of items for this dsName>
This issue has been mentioned on openHAB Community. There might be relevant details there:
https://community.openhab.org/t/all-persistance-stopped-saving-data-suddenly-mapdb-not-active/150379/8
I restricted the number of items which have the default strategy everyMinute by adding my own persistence configuration: Since then the system is the first time running since 11 hours with no error.
Strategies {
everyMin : "0 * * * * ?"
every5Min : "*/5 * * * * ?"
}
Items {
* : strategy = everyChange, restoreOnStartup
//Für Klimacharts & Heizung:
LuftfeuchteAbsolut,LuftfeuchteAbsolut_Outdoor,TemperaturenIst,TemperaturIst_Outdoor,LuftfeuchteRelativ_Outdoor : strategy = every5Min
TemperaturenIst*,TemperaturSoll*,LuftfeuchteRelativ*,Ventile* : strategy = everyMin
//Anwesenheit:
AnwesenheitJemand,AnwesenheitPerson1_Delayed,AnwesenheitPerson2_Delayed : strategy = every5Min
}
Still no errors with this strategy. So I think it is a bug.
This issue has been mentioned on openHAB Community. There might be relevant details there:
https://community.openhab.org/t/openhab-4-0-stopped-visualizing-data-in-a-chart/151032/3
This issue has been mentioned on openHAB Community. There might be relevant details there:
https://community.openhab.org/t/all-persistence-stopped-saving-data-suddenly-mapdb-not-active/150379/19
described here why it appears to be rather a terrible rrd4j-performance leading to backlogs-issue than a stopping. On my OpenHAB it continues to write values but seems to take much too long to store the values. As a result, that looks like rrd4j stopped two days ago (after 4 days of running time already).
For full details please read above post.
An extract of what I described there:
2024-04-03 18:22:48.441 [DEBUG] [d4j.internal.RRD4jPersistenceService] - Stored 'shelly3emshellyem3e8db84d67e241019109203_meter1_currentWatts' as value '130.32' with timestamp 1711975214 in rrd4j database
2024-04-03 18:22:51.138 [DEBUG] [d4j.internal.RRD4jPersistenceService] - Stored 'shelly3emshellyem3e8db84d67e241019109203_meter1_current' as value '0.93' with timestamp 1711975216 in rrd4j database
2024-04-03 18:22:52.133 [DEBUG] [d4j.internal.RRD4jPersistenceService] - Stored 'shelly3emshellyem3e8db84d67e241019109203_meter1_currentWatts' as value '117.64' with timestamp 1711975216 in rrd4j database
2024-04-03 18:22:54.491 [DEBUG] [d4j.internal.RRD4jPersistenceService] - Stored 'shelly3emshellyem3e8db84d67e241019109203_meter1_current' as value '0.94' with timestamp 1711975219 in rrd4j database
2024-04-03 18:22:54.841 [DEBUG] [d4j.internal.RRD4jPersistenceService] - Stored 'shelly3emshellyem3e8db84d67e241019109203_meter1_currentWatts' as value '117.11' with timestamp 1711975219 in rrd4j database
...
2024-04-03 18:26:02.541 [DEBUG] [d4j.internal.RRD4jPersistenceService] - Stored 'shelly3emshellyem3e8db84d67e241019109203_meter1_current' as value '0.93' with timestamp 1711975262 in rrd4j database
2024-04-03 18:26:03.441 [DEBUG] [d4j.internal.RRD4jPersistenceService] - Stored 'shelly3emshellyem3e8db84d67e241019109203_meter1_currentWatts' as value '115.18' with timestamp 1711975262 in rrd4j database
2024-04-03 18:26:07.341 [DEBUG] [d4j.internal.RRD4jPersistenceService] - Stored 'shelly3emshellyem3e8db84d67e241019109203_meter1_currentWatts' as value '130.17' with timestamp 1711975265 in rrd4j database
2024-04-03 18:26:07.573 [DEBUG] [d4j.internal.RRD4jPersistenceService] - Discarding value 153.36 for item shelly3emshellyem3e8db84d67e241019109203_meter1_currentWatts with timestamp 1712161567 because a new value (138.94) arrived with the same timestamp.
2024-04-03 18:26:07.822 [DEBUG] [d4j.internal.RRD4jPersistenceService] - Stored 'shelly3emshellyem3e8db84d67e241019109203_meter1_current' as value '0.98' with timestamp 1711975265 in rrd4j database
2024-04-03 18:26:10.850 [DEBUG] [d4j.internal.RRD4jPersistenceService] - Stored 'shelly3emshellyem3e8db84d67e241019109203_meter1_currentWatts' as value '118.06' with timestamp 1711975266 in rrd4j database
2024-04-03 18:26:15.655 [DEBUG] [d4j.internal.RRD4jPersistenceService] - Stored 'shelly3emshellyem3e8db84d67e241019109203_meter1_currentWatts' as value '124.79' with timestamp 1711975268 in rrd4j database
...
2024-04-03 18:26:51.246 [DEBUG] [d4j.internal.RRD4jPersistenceService] - Stored 'shelly3emshellyem3e8db84d67e241019109203_meter1_current' as value '0.97' with timestamp 1711975286 in rrd4j database
2024-04-03 18:26:56.196 [DEBUG] [d4j.internal.RRD4jPersistenceService] - Stored 'shelly3emshellyem3e8db84d67e241019109203_meter1_currentWatts' as value '115.31' with timestamp 1711975292 in rrd4j database
2024-04-03 18:26:57.005 [DEBUG] [d4j.internal.RRD4jPersistenceService] - Stored 'shelly3emshellyem3e8db84d67e241019109203_meter1_current' as value '0.93' with timestamp 1711975292 in rrd4j database
2024-04-03 18:27:00.736 [DEBUG] [d4j.internal.RRD4jPersistenceService] - Discarding value 137.12 for item shelly3emshellyem3e8db84d67e241019109203_meter1_currentWatts with timestamp 1712161620 because a new value (137.2) arrived with the same timestamp.
2024-04-03 18:27:02.023 [DEBUG] [d4j.internal.RRD4jPersistenceService] - Stored 'shelly3emshellyem3e8db84d67e241019109203_meter1_current' as value '0.98' with timestamp 1711975295 in rrd4j database
2024-04-03 18:27:03.541 [DEBUG] [d4j.internal.RRD4jPersistenceService] - Stored 'shelly3emshellyem3e8db84d67e241019109203_meter1_currentWatts' as value '133.03' with timestamp 1711975295 in rrd4j database
2024-04-03 18:27:06.241 [DEBUG] [d4j.internal.RRD4jPersistenceService] - Stored 'shelly3emshellyem3e8db84d67e241019109203_meter1_currentWatts' as value '129.44' with timestamp 1711975297 in rrd4j database
2024-04-03 18:27:07.551 [DEBUG] [d4j.internal.RRD4jPersistenceService] - Stored 'shelly3emshellyem3e8db84d67e241019109203_meter1_currentWatts' as value '126.16' with timestamp 1711975299 in rrd4j database
2024-04-03 18:27:09.241 [DEBUG] [d4j.internal.RRD4jPersistenceService] - Stored 'shelly3emshellyem3e8db84d67e241019109203_meter1_current' as value '0.97' with timestamp 1711975299 in rrd4j database
2024-04-03 18:27:15.883 [DEBUG] [d4j.internal.RRD4jPersistenceService] - Discarding value 141.18 for item shelly3emshellyem3e8db84d67e241019109203_meter1_currentWatts with timestamp 1712161635 because a new value (134.53) arrived with the same timestamp.
2024-04-03 18:27:16.146 [DEBUG] [d4j.internal.RRD4jPersistenceService] - Stored 'shelly3emshellyem3e8db84d67e241019109203_meter1_currentWatts' as value '128.8' with timestamp 1711975312 in rrd4j database
2024-04-03 18:27:18.546 [DEBUG] [d4j.internal.RRD4jPersistenceService] - Stored 'shelly3emshellyem3e8db84d67e241019109203_meter1_currentWatts' as value '128.24' with timestamp 1711975314 in rrd4j database
2024-04-03 18:27:20.404 [DEBUG] [d4j.internal.RRD4jPersistenceService] - Stored 'shelly3emshellyem3e8db84d67e241019109203_meter1_currentWatts' as value '115.17' with timestamp 1711975316 in rrd4j database
2024-04-03 18:27:22.055 [DEBUG] [d4j.internal.RRD4jPersistenceService] - Stored 'shelly3emshellyem3e8db84d67e241019109203_meter1_current' as value '0.92' with timestamp 1711975317 in rrd4j database
2024-04-03 18:27:31.200 [DEBUG] [d4j.internal.RRD4jPersistenceService] - Stored 'shelly3emshellyem3e8db84d67e241019109203_meter1_current' as value '0.92' with timestamp 1711975320 in rrd4j database
2024-04-03 18:28:02.700 [DEBUG] [d4j.internal.RRD4jPersistenceService] - Discarding value 151.91 for item shelly3emshellyem3e8db84d67e241019109203_meter1_currentWatts with timestamp 1712161682 because a new value (145.16) arrived with the same timestamp.
2024-04-03 18:28:05.711 [DEBUG] [d4j.internal.RRD4jPersistenceService] - Discarding value 135.29 for item shelly3emshellyem3e8db84d67e241019109203_meter1_currentWatts with timestamp 1712161685 because a new value (143.78) arrived with the same timestamp.
2024-04-03 18:28:08.698 [DEBUG] [d4j.internal.RRD4jPersistenceService] - Discarding value 137.24 for item shelly3emshellyem3e8db84d67e241019109203_meter1_currentWatts with timestamp 1712161688 because a new value (136.33) arrived with the same timestamp.
2024-04-03 18:28:11.715 [DEBUG] [d4j.internal.RRD4jPersistenceService] - Discarding value 137.66 for item shelly3emshellyem3e8db84d67e241019109203_meter1_currentWatts with timestamp 1712161691 because a new value (145.32) arrived with the same timestamp.
2024-04-03 18:29:00.807 [DEBUG] [d4j.internal.RRD4jPersistenceService] - Discarding value 139.04 for item shelly3emshellyem3e8db84d67e241019109203_meter1_currentWatts with timestamp 1712161740 because a new value (140.56) arrived with the same timestamp.
2024-04-03 18:29:15.638 [DEBUG] [d4j.internal.RRD4jPersistenceService] - Stored 'shelly3emshellyem3e8db84d67e241019109203_meter1_currentWatts' as value '115.83' with timestamp 1711975320 in rrd4j database
See the first entry logged at 2024-04-03 18:22:48: it is storing a value sent at 2024-04-01 14:40:14 (timestamp 1711975214) and the last one logged at 2024-04-03 18:29:15: it is storing a value sent at 2024-04-01 14:42:00 (timestamp 1711975320)
So in ~6.5 minutes it was able to only store values of ~1.75 minutes (from meanwhile almost two days ago!). And in deed, the OpenHAB graphs show the values up to that very timestamp. But as such, that looks like rrd4j stopped two days ago (after 4 days of running time already)
While those that are being logged with a current timestame 1712161xxx (=the white ones) were all discarded, due to “new value arrived with same timestamp”, but strangely that one is not logged here at all. Maybe it will in 2-3 days, when the processing finally reaches that timestamp.
Looks like I can observe the above for every item. A little difference to the energy meter values above though can be seen for the values that are not reported that often:
2024-04-03 18:29:17.200 [DEBUG] [d4j.internal.RRD4jPersistenceService] - Stored 'gShellyDW3ToiletWindow_State' as value '1.0' with timestamp 1711975319 in rrd4j database (again)
2024-04-03 18:29:17.200 [DEBUG] [d4j.internal.RRD4jPersistenceService] - Stored 'gShellyDW3ToiletWindow_State' as value '1.0' with timestamp 1711975320 in rrd4j database
There (with the same delay in timestamps) it looks like each one is processed twice, which looks strange to me as well but could be as expected.