AqualinkD
AqualinkD copied to clipboard
SWG settings bounce
This is similar to a recently closed issue #85 regarding SWG's bouncing between X% -> 0 -> X%. I have been having this issue since updating to 1.3.6 although I am not convinced this version is definitively the cause.
Symptom is my Aquarite SWG would stop generating every few minutes (low flow light starts). Then recover and quickly go back. Flow rate is NOT the issue as it was independent of speed. Looking at the AqD logs at the time I would see constant messages moving the SWG from 25% ->0 > 25%. I would also see that sometimes the iAqualink was also not coming back online, and when it did, I would see odd Watt/Speed numbers on the screen every so often. Shutting the AqualinkD process, everything would work fine and the SWG and iAqualink would run as expected. Starting the process and problems would start again.
I focused on the RS485 wiring and configuration and made some changes based on recommendations on the recently closed case. (Pump moved to Terminal block 1)
Terminal Block 2 has the pi and the iAqualink connected to it. First to the pi then daisy chains to the iAqualink. The outer wires go straight through to the iAqualink.
Terminal block 1 has both SWG and Pentair Pump connected directly to it. I also had to install a Chlorinator Translator (locked out) so its basically RS -> Translator -> Aquarite | -> Pentair I also completely changed out the wire to 22 gauge solid core security wire from the 24/28 gauge Cat5 i was using.
This did a lot to clean up a majority instances. However, I am still getting instances at seemingly random times where i am seeing messages of the SWG set to 0, then quickly going back to 25%. These times do NOT correlate with any changes in program schedules at those times.
Is it possible that any changes in code during the last couple of releases could be causing some increased errors ? I have started testing with 1.3.3d to see if i can find anything.
On a related note, which I have seen several times in the past releases as well, that sometimes stopping and restarting the AqualinkD process cases the RS to shut the Pump/System off. Not sure if you have seen this. It happened a couple of times today as I was swapping back and forth between 1.3.3d and 1.3.6 a few times.
There were some changes around 1.3.4 that make the SWG readings instant. Before it would wait for a few duplicate messages before actually notifying you of any change. Now it’s instant, I had to do this to overcome a bug of when people actually set it the percent to 0. So it could have been you always had the SWG reporting 25-0-25 but AqualinkD was ignoring it, as it only got one message at 0. Ill see if I can think of a way round this in the next release. As for pump turning off when you stop AqualinkD, no, I’ve never seen that before. Are you getting checksum errors in the log?
That is possible that this has been happening. I will keep testing with 1.3.3d for a couple of days to see if it changes anything. I do have very infrequent checksum errors. Maybe a couple of times per day. Last time was about 2 hours ago and see this in the logs Sep-30-19 20:19:04 PM Warning:Serial read bad Jandy checksum, ignoring Sep-30-19 20:19:04 PM Warning:Bad receive packet | Jandy | HEX: 0x10|0x02|0x69|0x10|0x02|0x50|0x11|0x19|0x8c|0x10|0x03| Sep-30-19 20:19:59 PM Warning:Serial read bad Jandy checksum, ignoring Sep-30-19 20:19:59 PM Warning:Bad receive packet | Jandy | HEX: 0x10|0x02|0x69|0x10|0x02|0x0a|0x02|0x40|0x14|0x00|0x00|0x00|0x72|0x10|0x03|
I am also tracking the MQTT entries at this time and did not see the SWG values change when the above errors were logged. With the previous problems that I have been seeing the MQTT values do change.
I do see values changed when the system changes modes (transition JVAs between modes) . I have eliminated this as the cause of the other times.
I will investigate if there is an environmental issue causing the error. The box that has the pi and serial adapter also has the iAqualink, and couple of strong static magnets i recently installed to hold a tablet. Possible there is some environmental interaction causing data errors ? The problems seem to manifest with the rest of the system when AqualinkD is running. Maybe a better quality USB converter with optical isolation and good power control or moving the pi and usb converter solves the problem.
Just now playing with the simulator on 1.3.3d (still testing that version) and when I just browse past the Aquapure menu items in System Setup without making changes, the following shows up in the logs. This is also reflected in the MQTT value changing as well and is repeatable:
Oct-01-19 13:33:12 PM Warning:WS: Send raw command to controller 0x18 Oct-01-19 13:33:13 PM Warning:WS: Send raw command to controller 0x18 Oct-01-19 13:33:13 PM Notice: Setting setpoint of Salt Water Generator to 0 Oct-01-19 13:33:13 PM Notice: Setting setpoint of Salt Water Generator to 0 Oct-01-19 13:33:13 PM Notice: SWG 0s already 0, not changing Oct-01-19 13:33:13 PM Notice: Setting setpoint of Salt Water Generator to 25 Oct-01-19 13:33:13 PM Notice: Setting setpoint of Salt Water Generator to 25 Oct-01-19 13:33:13 PM Notice: SWG 25s already 25, not changing Oct-01-19 13:33:15 PM Warning:WS: Send raw command to controller 0x18 Oct-01-19 13:33:16 PM Notice: Setting setpoint of Salt Water Generator to 0 Oct-01-19 13:33:16 PM Notice: Setting setpoint of Salt Water Generator to 0 Oct-01-19 13:33:16 PM Notice: SWG 0s already 0, not changing Oct-01-19 13:33:16 PM Notice: Setting setpoint of Salt Water Generator to 25 Oct-01-19 13:33:16 PM Notice: Setting setpoint of Salt Water Generator to 25 Oct-01-19 13:33:16 PM Notice: SWG 25s already 25, not changing Oct-01-19 13:33:17 PM Warning:WS: Send raw command to controller 0x18
This may just be a bug in the RS Firmware. Problem does not seem to occur when using the Setup Menus on the iAqualink. Since I do not have an RS Control Panel, I cannot tell if its only to simulator.
@sfeakes I think the wiring fixed my issue. After I clean the salt cell and turn the system on it bounces from 75->0->75 and then after a number of hours starts to settle. I have disabled the swg_zero_ignore_count in the aqualinkd.conf file so think that there is some residual zero_count in one of the files which is causing it to bounce. Is there a way to verify that the swg_zero_ignore_count is reset to default?
swg_zero_ignore_count Is only held in memory, so if it’s been removed from the config file, it’s not being used. You can see if it’s a bug in AqualinkD or actual messages from the SWG by turning on serial logging or RS485 debugging. Then when you see it bounce to zero check the log, you’ll see the message the SWG sent.
Still happens with mine. I cleaned up much of the wiring and have about one bounce every couple of hours on top of the expected ones when there is a mode change on my pool . I have not played with the swg_zero_ignore count yet but will try today and capture some more logs.
@sfeakes , do you have any thoughts regarding my above observation that the RS simulator causing the SWG to bounce when browsing past the menus ? Happy to post a video of it happening if that helps.
You can ignore the messages your seeing in the simulator, it’s a bug I introduced in the last release when posting logs into web ui . You will see those exact messages in the logs, I just forgot to turn that off when using the simulator. It’s not the cause of the problem. But if you can post the logs that would be helpful.
aqualinkd.log RS485.log Uploading logs. seems like its still changing SWG %
Thanks for the logs, Just a thought, can you try turning pentair packets off in the config, and see if the problem still exists.
read_pentair_packets = no
I disabled read_pentair_packets earlier this evening and had a couple of instances of the bounces. Do you know if the the SWG settings bounce on a polarity flip? The timing does not align from what i can tell but I don't think that is reported. I will upload some logs in the next day or so.
I think I know what might be happening, but I'll need some raw RS485 packets from your system to prove it (and so I can ultimately fix it). I'm updating AqualinkD so I can capture the information, I'll post a new version later today.
I've just posted V1.3.7, if you can download that and test it out, would be much appreciated. I don't think I've fixed the SWG bounce, but I'd like to get the raw logs from your system.
If you run from the command line with -rsrd
option, a file /tmp/RS485.raw
will be created, if you could post that I hope I can get to the bottom of the problem.
So after running the install script, run with something like the below
sudo ./release/aqualinkd -d -c /etc/aqualinkd.conf -v -rsrd
Wait for the problem to happen a few times and them post /tmp/RS485.raw
, you'll probably have to zip it up as it'll be large.
I will upgrade and test now. Should I re-enable Pentair ?
I have attached logs from last night. There was a bounce at ~04:14:31 . AqualinkD.zip
Yes, please re-enable Pentair. Good catch, forgot to mention raw log will only work with Pentair enabled.
I assume the RS485.log and RS485raw.log are redundant? Right now it is creating both since I did not disable the RS485 logging . Not a big deal, but not sure if you need both or not.
The RS485.log is post-processing, the RS485raw.log is pre-procesing. I can re-create the RS485.log if needed, so no need in posting that, it's the RS485raw.log that I need.
My theory is since the Pentair RS485 packets don't follow standard RS protocol, the Pentair pump doesn't understand the Jandy RS485 packets that do follow standard protocol. So the Pentair pump is posting packets in the middle of Jandy packets, (or visa-versa). And that's what's confusing either the the control panel or AqualinkD. If that's the case, I should be able to code around either, I just need the raw / pre-processed packets to see how it all looks.
BTW, if you have any other Pentair devices on the RS485 buss other than the pump, please turn them on, also play with RPM settings, the more Pentair packets that get in the log the better.
Caught one bounce at ~12:37:16. Logs attached. Sometimes the system is quiet, other times I will get a few bounces within a couple of hours. I am still logging for now and will post another log when I see more activity. AqualinkD131019-1.zip
I just updated a few minutes ago. Will take a look out for bounces and post later.
Latest logs from last night. ~ Timestamps of bounces not related to expected automation mode changes: ~06:30:23 (during spillover time) ~06:40:29 (during spillover time) ~ 07:21:49
Something that I noticed is that my SWG cell on light keeps bouncing (was doing this with 1.3.6 and again 1.3.7), cell on light is on, then turns off for half a second then back on. When I stopped aqualinkd, it stopped bouncing.
I stopped aqualinkd and it stopped bouncing. I then re-enabled aqualinkd to see if it would start acting up again.
Whats weird is my service 120 light came on after re-enabling aqualinkD (cell was cleaned last week).
I'm going to have it run without aqualinkd running for a few hours then re-enable it to see what happens.
It still gave an error (121 this time) when I had aqualinkD disabled. I cleaned the cell again and turned on. error message is now gone so going to run and see what happens for a few hours.
@niharmehta I've spent a few hours going over your logs, this is what looks to be happening. (Wanted to write this down before I forget)
Control panel sets SWG 25%
So the bounce from 25% to 0% back to 25% is accurate to what's actually happening on the RS485 protocol. The control panel is just NOT notifying iAqualink or RS8 of this SWG change however. So it's just hiding what's going on to the user, which is why you've never seen it before.
So, one way to overcome this is to simply use the RS8 messages for the SWG percentage, and not the RS485 packets. The config setting read_all_devices = no
will stop reading any RS485 packets not designated to be read by an RS8 keypad, there for hiding the bounce from AqualinkD
Another way would be to use the swg_zero_ignore_count = 50
in AqualinkD's config, you're logs show about 25 messages a 0 before going back, so setting that to 50 would probably hide the bounce from you.
Preferable, Figure out why your system is doing this with the pump. I'm looking at messages before and after the list I posted, to see if there is anything in their, like Pump bouncing between local & remote control or anything like that. A complete guess is you have a timer set locally on your pump that is overriding the Control Panel settings, and when that timer de-activates and control panel takes over is when you see this bounce.
That is really interesting. I wonder if this is a problem limited to Pentair pumps and potentially some sort of FW bug in the pump firmware (or integration with Jandy) that is normally hidden. It seems randomly odd times to have the pump decide to notify the RS it is shutting off. One point I am confused. I would have expected to see in my MQTT and logs monitoring that the RS control panel is sending 0% for 30s. I only see a brief message in the AqualinkD logs, and then an almost immediate bump back to 25%.
In normal operation mode, there does not seem to be any local control of the pump. It shows "Display Not Active". It does not necessarily mean that some older pre-automation programming is still in there that for some reason is ignoring the automation. I just disconnected the pump from the RS and cleared any old schedules from before the Aqualink was installed (set 1-4 to manual). Will report if any change.
I will think about the two options. Not sure I want to give up the read_all_devices . The swg_zero_ignore may be the best option if ultimately this causes issue.
I already have a hold down on my SWG 0% notifications for 5 minutes to handle any bounce or pump start/stops. So this is mostly an issue of unneeded state changes being tracked.
My 30 sec statement on the 0% part of the bounce, was an estimate at time by counting the number of messages. You have 228 total messages between the SWG 0 part of the bounce, each message is about 0.1 second apart, so 22.8 seconds. Crap, while typing that, I forgot I shouldn't use 228 messages, I should have used just the SWG messages of 28. So 2.8 seconds as a very rough guess. But that's going by time on how my system works, you're control panel is a lot newer & a different chipset, so maybe faster.
I think the swg_zero_ignore is the better option. I could also change read_all_devices to specific devices, so you could ignore SWG and keep the pump information.
BTW, you didn;t see this in some of the previous versions as I had code to completely ignore any "set SWG to 0" messages. That way you knew the SWG % when the pump was off. But some people want to actually set the SWG to 0, so I have to remove that code. I could also make that a config option, but you'd never be able to actually set the SWG to 0 if you used it.
Give me a few more days and I'll see if I can find the time to go over your other logs and every single pump messages, see if I can find out anything else.
Thanks!. Take your time. I have options around it if needed and this for now seems more cosmetic than anything causing problems. My hold downs have quieted down the notifications to almost nothing.
I think having the option to set the SWG to 0 is a useful in cases of adding salt, or shutting it off for the winter.
Something very interesting, right in the middle of the 0 bounce.
0x10|0x02|0x00|0x01|0x3f|0x00|0x52|0x10|0x03|
*1) 0x10|0x02|0x50|0x11|0x00|0x73|0x10|0x03|
0x10|0x02|0x00|0x16|0x25|0x00|0x00|0x00|0x4d|0x10|0x03|
0x10|0x02|0x80|0x00|0x92|0x10|0x03|
0x10|0x02|0x0a|0x02|0x00|0x14|0x00|0x00|0x00|0x32|0x10|0x03|
*2) 0x10|0x02|0x0a|0x03|0x00|0x20|0x20|0x41|0x71|0x75|0x61|0x50|0x75|0x72|0x65|0x20|0x32|0x35|0x25|0x20|0x20|0x6f|0x10|0x03|
0x10|0x02|0x60|0x00|0x72|0x10|0x03|
0x10|0x02|0x33|0x30|0x75|0x10|0x03|0x00|0x00|
0x10|0x02|0x00|0x01|0x00|0x00|0x13|0x10|0x03|
0x10|0x02|0xa3|0x53|0x08|0x10|0x03|0x00|0x00|
0x10|0x02|0x00|0x01|0x3f|0x00|0x52|0x10|0x03|
*3) 0x10|0x02|0x50|0x11|0x00|0x73|0x10|0x03|
0x10|0x02|0x00|0x16|0x25|0x00|0x00|0x00|0x4d|0x10|0x03|
Message #1 and #3 are Control panel telling the SWG to set the % to zero, message #2 is Control Panel sending AqualinkD an ascii message that states 'AquaPure 25% '. So not only is the control panel not telling RS8 keypad of the bounce, it's actively 'hiding it' from the RS8 keypad. Makes me think the control panel actually knows it's bouncing to 0 for a reason, and not a reset pump it didn't actually know about. Means there is some logic as to why it's doing this, just need to figure out why now. Also explains you not seeing the 0 for the full duration ~2.8 seconds, as AqualinkD would do the following. < All readings to this point 25> Read the RS485 packet and record SWG to 0 Read the RS485 packet and record SWG to 0 -- repeat above -- Read the RS8 message and record SWG to 25 Read the RS485 packet and record SWG to 0 Read the RS485 packet and record SWG to 0 -- repeat above -- Read the RS485 packet and record SWG to 25 < All readings to this point 25>
So one bounce to 0, would actually show 2 in MQTT.
Wait.. so the CP tells SWG to go to 0% then back up without any external commands telling it so, but then directly tells the RS8 "Nothing new here.. setting is at 25% just like you wanted".
Very odd, so either it is reacting to some other external signal that you have not seen yet or there is a specific reason in the AqualinkRS (either planned or unplanned) doing this and then catching it and resetting.
I think I still have those hourly (at almost exactly 60m) BAD packets show up in the logs (currently at close to 17 minutes pas the hour. No idea what is causing it or where it is from. I wonder if this is something specific to iAqualink or the Pentair. I may pull the iAquallink off the bus to see if that changes anything.
Yup, that’s what’s happening. As for the bad packets, I just read yours manually so didn’t do any checksum checks, I’ll run your raw logs through my parser to check, but if it’s just Pentair checksum errors and not Jandy, that could be my code. If it’s every hour, maybe the CP is putting the pump into local run mode for 1 hour, then when it needs to reset / pass another run for 1 command that’s what’s showing up. I’ll spend some more time going over everything when I get the chance, but I haven’t figured all the Pentair data out just yet, there are still a lot of unknowns on that pump protocol to me, and no more information on the net.
Another set of logs.
This is really interesting. For the 2nd day in a row, I was alerted that the Aquapure Pool setting in iAqualink was set to 0% for more than 5 minutes. I thought the day before I may have accidentally set it manually while testing, but absolutely sure I did not this time as my monitoring showed that the switch occurred while I was nowhere near my computer at 06:31:20 this morning. I also see several hourly bounces of the SWG enabled going to 0 and back to 2.
At ~6:31:15 there is a couple of bounces and a flurry of programming messages following Oct-15-19 06:31:15 AM Debug: Ignoring SWG set to 0 due to packet packet count 0 <= 0 from control panel to SWG 0x00 0x73.
Looking at the logs, there is a bunch of entries of what looks like bounces as well as where it looks like an RS menu is being walked through and then at ~6:31:20 it drops to 0 again and stays there. The iAqualink shows the Aquapure Pool setting is at 0.
For a couple of minutes after, I see the messages as something is scanning the menus (not sure what 'something' is) : Oct-15-19 06:32:25 AM Debug: Find item in Menu: loop 25 of 25 looking for 'SET AQUAPURE' received message 'Jandy AquaLinkRS'
The other issue is there seems to be a ' enabled' bounce now every hour at about 15 minutes past the hour. I noticed this via the aqualinkd/SWG/enabled in MQTT 2:14:50 3:15:09 4:15:06 5:15:11 (as well as several others between 5:07 - 5:32) 6:15:12 7:15:12 8:15:08 9:14:43
What is interesting is that I think the actual SWG setting was still 0 at 7:15, 8:15, and 9:14. Just the enabled setting was bouncing.
Well.. this is odd. After posting the last update, I restarted aqualinkD and 5 minutes later received a notice that my SWG was set to 0 for more than 5 minutes . (This time SPA since that mode was running).
Oct-15-19 10:29:46 AM Notice: Listening to Aqualink RS8 on serial port: /dev/ttyUSB0
Oct-15-19 10:29:46 AM Info: MQTT: Subscribing mqtt with id of: aqualinkd_b827eb7d
Oct-15-19 10:29:46 AM Info: MQTT: Connection acknowledged
Oct-15-19 10:29:46 AM Info: MQTT: Subscribing to 'aqualinkd/#'
Oct-15-19 10:29:46 AM Info: MQTT: Subscription(s) acknowledged
Oct-15-19 10:29:46 AM Info: MQTT: Published id=1: aqualinkd/Alive 1
Oct-15-19 10:29:46 AM Debug: MQTT: Ignore aqualinkd/Alive 1
Oct-15-19 10:29:46 AM Debug: MQTT: Ignore aqualinkd/Freeze_Protect 0
Oct-15-19 10:29:46 AM Debug: MQTT: Ignore aqualinkd/Freeze_Protect/setpoint 36.00
Oct-15-19 10:29:46 AM Debug: MQTT: Ignore aqualinkd/Freeze_Protect/enabled 1
Oct-15-19 10:29:46 AM Debug: MQTT: Ignore aqualinkd/Battery 1
Oct-15-19 10:29:46 AM Debug: MQTT: Ignore aqualinkd/SWG 2
Oct-15-19 10:29:46 AM Debug: MQTT: Ignore aqualinkd/SWG/fullstatus 0
Oct-15-19 10:29:46 AM Debug: MQTT: Ignore aqualinkd/SWG/enabled 2
Oct-15-19 10:29:46 AM Debug: MQTT: Ignore aqualinkd/SWG/PPM 3700
Oct-15-19 10:29:46 AM Debug: MQTT: Ignore aqualinkd/SWG/Percent 0
Oct-15-19 10:29:46 AM Debug: MQTT: topic aqualinkd/SWG/Percent/set 0
Oct-15-19 10:29:46 AM Notice: Setting setpoint of Salt Water Generator to 0
Oct-15-19 10:29:46 AM Debug: MQTT: Ignore aqualinkd/SWG/Percent_f -17.78
I have attached logs again. For some reason, the RS485raw was not in the /tmp directory this time. The only automation I have setting the percentage is a slider bar in home assistant. Nobody is using it, but I will disable it to see if it changes anything.