openhab-core icon indicating copy to clipboard operation
openhab-core copied to clipboard

Group item (Aggregation) does not update immediately (big delay)

Open michaeljoos72 opened this issue 3 years ago • 24 comments

Hi

This issue I have seen in all the realeases & snapshots and I couldn't find out how to solve.

I have configured group items "All Doors" & "All Windows". My problem is, that the group item doesn't update immediately. Sometimes with a 2-3 seconds delay, sometimes much more. Group members are updating immediately & correct.

Example:

image

Aggregation is: All CLOSED then CLOSED, else OPEN

image

My setup is quite big: 180 Things, 738 items, 100 rules

I don't think it's a performance issue (Raspi 4)

Thanks Michael

michaeljoos72 avatar Mar 07 '21 10:03 michaeljoos72

Can you show an events.log extract of this, when you provoke a change?

Rossko57 avatar Mar 07 '21 12:03 Rossko57

Jep, this is now one example:

image

HmIP_Tuersensor_Esstisch did change, but no event for group item "Doors"

michaeljoos72 avatar Mar 07 '21 13:03 michaeljoos72

There won't be a logged Group Item event unless the Group state changes, I think.

If you are interested in no-change Group Item state updates, you could have to capture those with a rule. But beware you may get multiple updates from a single member event.

Rossko57 avatar Mar 08 '21 12:03 Rossko57

Yes, agree. But this is my question/problem . . . why does the group item not change immediately as soon as a member does update? For me it seems that the group is not triggered with that event.

I have recorded a video: You can see that after about 10s a group member status changes from OFF to ON, Aggregation is set to "One ON then ON else OFF". But the group item does not change to ON.

https://user-images.githubusercontent.com/20517474/110428698-ff8a1580-80a9-11eb-9ccc-96a6235c8ebe.mp4

This is the group item setting:

image

michaeljoos72 avatar Mar 09 '21 06:03 michaeljoos72

Can you supply an events.log showing an event causing a Group state change, with timestamps, so that you can determine if you are looking at an update delay or a UI display refresh lag.

Rossko57 avatar Mar 09 '21 23:03 Rossko57

Small correction from my side --> It is OK in official release 3.0.1, but in latest snapshots something did change.

Today I did install released version 3.0.1 and back to latest snapshot to compare.

events_openHAB 3.0.1.log:

2021-03-10 17:57:39.616 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'HmIP_Fenstersensor_Office' changed from CLOSED to OPEN
2021-03-10 17:57:39.620 [INFO ] [hab.event.GroupItemStateChangedEvent] - Item 'Windows' changed from CLOSED to OPEN through HmIP_Fenstersensor_Office
2021-03-10 17:57:43.095 [INFO ] [hab.event.GroupItemStateChangedEvent] - Item 'Windows' changed from OPEN to CLOSED through HmIP_Fenstersensor_Office
2021-03-10 17:57:43.097 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'HmIP_Fenstersensor_Office' changed from OPEN to CLOSED
2021-03-10 17:57:45.063 [INFO ] [hab.event.GroupItemStateChangedEvent] - Item 'Windows' changed from CLOSED to OPEN through HmIP_Fenstersensor_Office
2021-03-10 17:57:45.067 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'HmIP_Fenstersensor_Office' changed from CLOSED to OPEN
2021-03-10 17:57:47.637 [INFO ] [hab.event.GroupItemStateChangedEvent] - Item 'Windows' changed from OPEN to CLOSED through HmIP_Fenstersensor_Office
2021-03-10 17:57:47.639 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'HmIP_Fenstersensor_Office' changed from OPEN to CLOSED

'HmIP_Fenstersensor_Office' is the member item and 'Windows' the group item. Group item is changing immediately (through HmIP_Fenstersensor_Office).

Then I did install latest snapshot. No change in configuration. Same items, same action:

events_openHAB 3.1.0_build_2247.log:

2021-03-10 18:23:58.705 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'HmIP_Fenstersensor_Office' changed from OPEN to CLOSED
2021-03-10 18:24:01.852 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'HmIP_Fenstersensor_Office' changed from CLOSED to OPEN
2021-03-10 18:24:33.447 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'HmIP_Fenstersensor_Office' changed from OPEN to CLOSED
2021-03-10 18:24:34.927 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'HmIP_Fenstersensor_Office' changed from CLOSED to OPEN
2021-03-10 18:24:38.035 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'HmIP_Fenstersensor_Office' changed from OPEN to CLOSED
2021-03-10 18:24:40.070 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'HmIP_Fenstersensor_Office' changed from CLOSED to OPEN
2021-03-10 18:25:06.608 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'HmIP_Fenstersensor_Office' changed from OPEN to CLOSED

The group item did not change at all. Even after 2 minutes no event and no change in UI.

michaeljoos72 avatar Mar 10 '21 17:03 michaeljoos72

Thankyou for the useful logs,and the extra 3.0.1 observation. Your 3.0.1 example indeed looks good, sometimes the Group logs a change even before the triggering Item logs.

Rossko57 avatar Mar 10 '21 20:03 Rossko57

This issue has been mentioned on openHAB Community. There might be relevant details there:

https://community.openhab.org/t/oh3-group-state-delayed-for-item-members/120081/2

openhab-bot avatar Apr 02 '21 17:04 openhab-bot

Same behavior in 3.1.0M3.

Stefan136 avatar Apr 02 '21 18:04 Stefan136

Hi,

I face the same issue using OH3.0.1 and a group of Contacts using the One OPEN then OPEN else CLOSED aggregation function. In my case the update takes minutes, up to a hour, and not seconds only 👎 image

Config of the group: image

I'm using other groups with other aggregation functions and they work without problems. E.g. I'm using a Group of type Number:Power with aggregation function SUM and this one is updated roughly every second (on each update of one of my Shellys): image

I'm running OH3.0.1 in a docker container of my Synology NAS. The CPU load generated by OH is quite high: image

Is it possible, that the high CPU load, or the high frequent update of the Number:Power group mentioned above, is blocking the update of the Contacts group?

MikeTheTux avatar Apr 12 '21 11:04 MikeTheTux

It seems that this is a performance issue. At least in my setup. Because of fears for power failure / data loss I always have disabled ZRAM in my system with approx 700 items. CPU-Load with that configuration was all the time between 1.5 and 2.5 (Raspi 4 4GB and 8GB). Now with ZRAM enabled CPU-Load is between 0.1 and 0.3 (!!) and Group Items always working without any delay. And to avoid Power/Data-Loss I bought a Zendure X6 with UPS-Functionality. Quite expensive but it works perfect!

michaeljoos72 avatar Apr 25 '21 08:04 michaeljoos72

This issue has been mentioned on openHAB Community. There might be relevant details there:

https://community.openhab.org/t/oh3-group-state-delayed-for-item-members/120081/22

openhab-bot avatar Apr 25 '21 08:04 openhab-bot

I don't think the problem is a performance problem in my case.

I shared my detailed testing on openHAB Community and the time between the group command is send and event is triggered is increasing from seconds up to several hours.

I tried to change the group item state to OFF several times via GUI or API Explorer, but always without success. I can see the commandEvent in the log, but the state doesn't change.

I'm wondering who can support on this topic - seems something is wrong in OH3.

2021-04-25 17:20:34.377 [INFO ] [openhab.event.ItemCommandEvent ] - Item 'gBewaesserung_Aktiv' received command OFF

top - 17:05:28 up 15 days, 20:06,  0 users,  load average: 3.71, 3.83, 3.49
Tasks:   4 total,   1 running,   3 sleeping,   0 stopped,   0 zombie
%Cpu(s):  3.0 us,  4.5 sy,  0.0 ni, 68.6 id, 23.9 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :   7924.5 total,    354.1 free,   5324.4 used,   2246.0 buff/cache
MiB Swap:  23967.2 total,  23103.9 free,    863.4 used.   1904.8 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND                                                                                                     
   96 openhab   20   0 6366520   2.3g   4748 S   0.7  29.7 270:52.46 java                                                                                                        
    1 openhab   20   0    2284     12      0 S   0.0   0.0   0:07.48 tini                                                                                                        
17590 root      20   0    6796   3444   2992 S   0.0   0.0   0:00.00 bash                                                                                                        
17597 root      20   0   10924   3444   2968 R   0.0   0.0   0:00.01 top 

Stefan136 avatar Apr 25 '21 17:04 Stefan136

I can see the commandEvent in the log, but the state doesn't change.

Please take care to understand Group mechanism, to avoid muddying the issue. Sending command to Group does not change Group state. It just gets passed along to members. Of course, commands to member Items may change their states; those member state changes may in turn affect Group state. Just to be clear, the events of interest for Group state are member Item state changes only.

Rossko57 avatar Aug 27 '21 00:08 Rossko57

This issue has been mentioned on openHAB Community. There might be relevant details there:

https://community.openhab.org/t/group-state-not-updated-after-members-update/109668/19

openhab-bot avatar Aug 27 '21 00:08 openhab-bot

After updating to OH 3.1.0 I noticed the described delay. Is there any fix planned for this issue? I guess this issue should be tagged as bug. Right? To be honest, in my case it blows off the meaning to use groups. Of course one could implement the behaviour in own rules. It just is a lot of work. A statement would be nice whether this issue is planned to be fixed or not. Just to judge the time to invest in creating own rules.

maja68 avatar Feb 13 '22 17:02 maja68

I have that very same issue after migrating from OH2 to OH3 (3.2.0 Release Build on Docker on Synology). I am also very interested in a fix, since my Alexa keeps alarming, although I closed all Windows. After some seconds when the group item is updated she stops, though :) Sometimes there are a lot of seconds...

soeren-r avatar Feb 14 '22 19:02 soeren-r

In the meantime I experienced an improvement by adding SSD Cache to my Synology NAS. Before the SSD Cache the group was updated with very large and random delay. As there was quite some volume I/O and the CPU was often waiting for I/O operation, I decided to add the SSD Cache. Once this was added, there is only very small waiting time of CPU related to I/O. Since then the group update works fine.

This is just an observation, I hope this helps to reproduce the issue on a developer system.

maja68 avatar May 12 '22 10:05 maja68

@maja68 your problem is probably the same as @michaeljoos72, it's just a performance issue related to disk I/O and my feeling is, that the issue of @soeren-r is also the same (from my experience with virtual machines/docker on NAS systems).

I have also check the test setup from @Stefan136 (for reference, from the thread linke above)

Group:Switch:OR (ON, OFF) gGroupTest "Test Group"
Switch gTestA "Test A [% s]" (gGroupTest)
Switch gTestB "Test B [% s]" (gGroupTest)

And it works perfectly fine, every time one of the members changes, the group updates accordingly (if necessary), and sending commands to the group also results in commands to the individual items. Can you provide additional information to reproduce the issue?

J-N-K avatar May 12 '22 16:05 J-N-K

Hi folks, thanks for your comments on this. If this would be a performance issue, this would be a pity, since my DS218+ does not support SSD Cache. Has anyone with knowledge in virtualization has a recommendation on how to provide better performance on the NAS? New/faster HDDs maybe?

soeren-r avatar May 12 '22 16:05 soeren-r

If google is correct, a DS218+ has only 2 GB RAM. Compared with the current recommendation for running openHABian (RPi4 with 2GB memory - this does not include docker but is a direct installation) 2GB for running docker and all other Syno services, this seems to be underrated and will result in excessive swapping (which is bad, performance-wise).

J-N-K avatar May 12 '22 16:05 J-N-K

Hi @J-N-K, you are right. I upgraded to 10 GB RAM since the beginning. RAM shouldn't be an issue. But I assigned only 3 GB to that specific container. I will try and give it some more, although it says that RAM is only used around 50%. Worth a try to give it some more.

soeren-r avatar May 12 '22 16:05 soeren-r

@soeren-r I don't think it's related to RAM. On my system (DS918+) 8GB RAM are installed and OH runs on the official docker container. As long as the I/O was low the group update worked fine. I even had the effect of having only occasionally updated rrd files. Meaning samples where visible on sitemap value but not on charts. May be the not successful updating of rrd files caused not updating the group? There is a lot of guess, hope this makes sense for someone knowing the code.

maja68 avatar May 12 '22 18:05 maja68

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Jul 11 '22 21:07 stale[bot]

I still have the problem on OH 3.3.0. Running Openhab also on a DS716+ with 8 GB RAM.

helipus avatar Jan 22 '23 19:01 helipus

If you have a setup that can reproduce this issue, please open a new one.

J-N-K avatar Jan 22 '23 22:01 J-N-K

I had the same problem. It is gone since I removed all orpahn objects (https://www.openhab.org/docs/administration/runtime.html#links) as well as migrated all my DSL to JavaScript rules.

MikeTheTux avatar Jan 23 '23 08:01 MikeTheTux

@MikeTheTux Good to know. I will give the orphan objects method a try and will report back. I still have the very same issues on the latest OH3 stable as docker image. I do not plan to migrate all my rules (currently nice and tidy in files) to JavaScript rules.

soeren-r avatar Jan 23 '23 11:01 soeren-r

@soeren-r I was facing the same issue, also in a docker setup with plenty resources available and in combination with a quite low CPU load.

Can you deactivate your rules for a while (e.g. move the files outside of the rules directory) in order to see if it solves the problem?

As I stated, my final solution was the migration towards JavaScript rules. It was some work, but when looking back, I don't miss the DSL language. Also with JavaScript rules, I'm still going for the file based rule support. Means with manageable effort you can migrate your DSL files from Java (DSL) towards JavaScript language - the difference is not that big, the first file to be migrated is the hardest one ;-)

MikeTheTux avatar Jan 23 '23 12:01 MikeTheTux

I removed all orphans and rebooted the container. That of course did not help. It still takes 40 seconds to update the group item.

Then I moved away all .rules files and rebooted the container. Same behaviour. So I guess converting my rules won't help either.

One thing that came into my mind: I also defined all my items in text files as well as most of my things. Where are your items and things defined, @MikeTheTux ?

soeren-r avatar Jan 24 '23 09:01 soeren-r