Viewing graphs can break when boost is running in some rare cases
Hey all,
On 1.2.22, some devices show gaps in their graphs between boost runs. I traced the data from spine all the way to poller_output_boost.
The data makes its way through correctly until it hits the RRA write; for some devices it seems the data is never written. The data is removed from poller_output_boost after each run, so it is not stuck in there.
There are no relevant errors in the log. For the affected devices I have rebuilt the poller cache, which made no difference.
I found this when running boost manually:
php poller_boost.php --verbose --debug --force >> /tmp/boost.txt
2022/10/03 14:33:40 - CMDPHP SQL Backtrace: (/poller_boost.php[222]:boost_output_rrd_data(), /poller_boost.php[588]:boost_process_local_data_ids(), /poller_boost.php[688]:db_fetch_assoc(), /lib/database.php[593]:db_fetch_assoc_prepared(), /lib/database.php[613]:db_execute_prepared())
2022/10/03 14:33:40 - CMDPHP ERROR: A DB Row Failed!, Error: Table 'cacti.poller_output_boost_arch_1664822004' doesn't exist
2022/10/03 14:33:40 - BOOST CHILD DEBUG: Processing 128 of 130 for Boost Process 1
2022/10/03 14:33:40 - CMDPHP SQL Backtrace: (/poller_boost.php[222]:boost_output_rrd_data(), /poller_boost.php[588]:boost_process_local_data_ids(), /poller_boost.php[688]:db_fetch_assoc(), /lib/database.php[593]:db_fetch_assoc_prepared(), /lib/database.php[613]:db_execute_prepared())
2022/10/03 14:33:40 - CMDPHP ERROR: A DB Row Failed!, Error: Table 'cacti.poller_output_boost_arch_1664822004' doesn't exist
2022/10/03 14:33:40 - BOOST CHILD DEBUG: Processing 127 of 130 for Boost Process 1
2022/10/03 14:33:40 - CMDPHP SQL Backtrace: (/poller_boost.php[222]:boost_output_rrd_data(), /poller_boost.php[588]:boost_process_local_data_ids(), /poller_boost.php[688]:db_fetch_assoc(), /lib/database.php[593]:db_fetch_assoc_prepared(), /lib/database.php[613]:db_execute_prepared())
2022/10/03 14:33:40 - CMDPHP ERROR: A DB Row Failed!, Error: Table 'cacti.poller_output_boost_arch_1664822004' doesn't exist
2022/10/03 14:33:40 - BOOST CHILD DEBUG: Processing 126 of 130 for Boost Process 1
2022/10/03 14:33:40 - CMDPHP SQL Backtrace: (/poller_boost.php[222]:boost_output_rrd_data(), /poller_boost.php[588]:boost_process_local_data_ids(), /poller_boost.php[688]:db_fetch_assoc(), /lib/database.php[593]:db_fetch_assoc_prepared(), /lib/database.php[613]:db_execute_prepared())
2022/10/03 14:33:40 - CMDPHP ERROR: A DB Row Failed!, Error: Table 'cacti.poller_output_boost_arch_1664822004' doesn't exist
2022/10/03 14:33:40 - BOOST CHILD DEBUG: Processing 125 of 130 for Boost Process 1
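To check whether every failure in a run like the one above points at the same missing archive table, the error lines can be grouped by table name. This is only a sketch: `missing_tables` is a hypothetical helper, not part of Cacti, and the regular expression assumes the error text has exactly the form shown in the log above.

```python
import re
from collections import Counter

# Matches the "table doesn't exist" message as it appears in the log above.
MISSING_TABLE = re.compile(r"Error: Table '([^']+)' doesn't exist")

def missing_tables(log_text):
    """Count "doesn't exist" errors per table name (hypothetical helper)."""
    return Counter(MISSING_TABLE.findall(log_text))

# Three lines copied from the log above.
sample = """\
2022/10/03 14:33:40 - CMDPHP ERROR: A DB Row Failed!, Error: Table 'cacti.poller_output_boost_arch_1664822004' doesn't exist
2022/10/03 14:33:40 - BOOST CHILD DEBUG: Processing 128 of 130 for Boost Process 1
2022/10/03 14:33:40 - CMDPHP ERROR: A DB Row Failed!, Error: Table 'cacti.poller_output_boost_arch_1664822004' doesn't exist
"""

for table, count in missing_tables(sample).items():
    print(table, count)
```

Run against the full /tmp/boost.txt, a single table name in the result would suggest that one archive table disappeared while the child process was still working through it.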
Output from debug file
DEBUG: Checking if Boost is ready to run.
DEBUG: Last Runtime was 2022-10-03 14:08:27 (1664820507).
DEBUG: Next Runtime is 2022-10-03 15:08:27 (1664824107).
DEBUG: Records Found:6717232, Max Threshold:7000000.
DEBUG: Time to Run Boost, Force Run is true!
DEBUG: Parallel Process Setup Begins.
DEBUG: Data Sources:89253, Concurrent Processes:1
DEBUG: Parallel Process Setup Complete. Ready to spawn children.
DEBUG: About to launch 1 processes.
DEBUG: Launching Boost Process Number 1
Total[1.4670] DEBUG: About to Spawn a Remote Process [CMD: /bin/php, ARGS: /var/www/html/cacti/poller_boost.php --child=1 --debug]
DEBUG: 1 Processes Running, Sleeping for 2 seconds.
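The timestamps in the debug output can be lined up with the archive table named in the errors. The sketch below assumes that the numeric suffix of `poller_output_boost_arch_1664822004` is a Unix timestamp; that is an assumption based on how the number compares with the epochs printed above, not something confirmed in this thread.

```python
from datetime import datetime, timedelta

# Values copied from the debug output and the error log above.
last_run_epoch = 1664820507                        # "Last Runtime" epoch
next_run_epoch = 1664824107                        # "Next Runtime" epoch
last_run_local = datetime(2022, 10, 3, 14, 8, 27)  # same instant as printed in the log
arch_suffix = 1664822004                           # from poller_output_boost_arch_1664822004

# Seconds between the last run and the next scheduled run.
interval = next_run_epoch - last_run_epoch

# Assumption: the suffix is a Unix timestamp. Offsetting from the known
# epoch/local-time pair avoids guessing the server's time zone.
arch_local = last_run_local + timedelta(seconds=arch_suffix - last_run_epoch)

print(interval)    # 3600
print(arch_local)  # 2022-10-03 14:33:24
```

Under that assumption the archive table dates from 14:33:24, sixteen seconds before the errors logged at 14:33:40 and well ahead of the next scheduled run at 15:08:27.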
Boost tables are clean according to audit_database
bash-4.2$ php audit_database.php --report | grep boost
Checking Table: 'poller_output_boost' - Clean
Checking Table: 'poller_output_boost_local_data_ids' - Clean
Checking Table: 'poller_output_boost_processes' - Clean
bash-4.2$
OK, so I think the first time I ran boost manually I collided with Cacti running it. Rerunning it manually seems fine, but the strange thing is that the first time I ran it the boost table went empty.
Test now @bmfmancini
So far so good, @TheWitness. I will let it soak for a bit and let you know.
@TheWitness unfortunately I am still seeing gaps in plotting.
I have confirmed it only happens after a graph has been viewed and boost runs afterwards.
Any errors in the log?
No error at all
Well, that's good. Now you have to find the real reason. How many poller items for the device in question?
Various numbers of poller items and different templates.
A lot of them have nothing in common device-type-wise, and it happens at random times, but it always seems to be associated with a boost run.
You need to be very specific. If there is more than one device, give me a count for each.
Ok will do
What RRDtool version?
I have another theory...
However, you need to answer the poller items question for a few of the cases.
RRDtool version is 1.4
Upgrade to 1.8
OK, updated to RRDtool 1.8:
RRDtool 1.8.0 Copyright by Tobias Oetiker <[email protected]>
Compiled Oct 11 2022 11:19:31
Gaps are still being seen. After viewing a graph, the data for that time period is removed from the poller_output_boost table. Sometimes the RRA is updated without a problem, but other times the graph shows a large gap.
While checking for data, the poller_output_boost table has entries for that data source, and they then disappear from the table while the graph still shows a gap. On the next boost run the graph starts to plot again, but only with the data that populated the table since the graph was viewed.
Here are my steps
1.) View poller_output_boost table
MariaDB [cacti]> select * from poller_output_boost where local_data_id = '67278' \G
*************************** 1. row ***************************
local_data_id: 67278
rrd_name: discards_in
time: 2022-10-11 13:18:02
output: 0
*************************** 2. row ***************************
local_data_id: 67278
rrd_name: discards_out
time: 2022-10-11 13:18:02
output: 0
*************************** 3. row ***************************
local_data_id: 67278
rrd_name: errors_in
time: 2022-10-11 13:18:02
output: 0
*************************** 4. row ***************************
local_data_id: 67278
rrd_name: errors_out
time: 2022-10-11 13:18:02
output: 0
4 rows in set (0.001 sec)
2.) View the graph

3.) Check the poller_output_boost table. Entries for the timespan you are viewing will have been removed; only newly polled data remains.
MariaDB [cacti]> select * from poller_output where local_data_id = '67278';
Empty set (0.000 sec)
MariaDB [cacti]> select * from poller_output where local_data_id = '67278';
Empty set (0.000 sec)
MariaDB [cacti]> select * from poller_output_boost where local_data_id = '67278';
Empty set (0.000 sec)
MariaDB [cacti]> select * from poller_output_boost where local_data_id = '67278';
+---------------+--------------+---------------------+--------+
| local_data_id | rrd_name | time | output |
+---------------+--------------+---------------------+--------+
| 67278 | discards_in | 2022-10-11 13:23:02 | 0 |
| 67278 | discards_out | 2022-10-11 13:23:02 | 0 |
| 67278 | errors_in | 2022-10-11 13:23:02 | 0 |
| 67278 | errors_out | 2022-10-11 13:23:02 | 0 |
+---------------+--------------+---------------------+--------+
4 rows in set (0.001 sec)
The graph will still show the gap until boost runs, but only the newly polled data will make it to the RRA; the other data will be lost.
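The before/after comparison in these steps can be made mechanical by diffing two snapshots of the table. A minimal sketch, assuming the rows have already been fetched as tuples; `vanished_rows` is a hypothetical helper and the tuples are copied from the query output above.

```python
def vanished_rows(before, after):
    """Return rows present in the first snapshot but missing from the second."""
    return sorted(set(before) - set(after))

# (local_data_id, rrd_name, time) as seen in step 1, before viewing the graph.
before = [
    (67278, "discards_in", "2022-10-11 13:18:02"),
    (67278, "discards_out", "2022-10-11 13:18:02"),
    (67278, "errors_in", "2022-10-11 13:18:02"),
    (67278, "errors_out", "2022-10-11 13:18:02"),
]

# Rows seen in step 3, after viewing the graph and one more poll.
after = [
    (67278, "discards_in", "2022-10-11 13:23:02"),
    (67278, "discards_out", "2022-10-11 13:23:02"),
    (67278, "errors_in", "2022-10-11 13:23:02"),
    (67278, "errors_out", "2022-10-11 13:23:02"),
]

for row in vanished_rows(before, after):
    print(row)
```

Rows leaving the table are expected once they have been written to the RRD file, so this only narrows down which timestamps to look for in the RRA; it does not by itself show that data was lost.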

Oops, I forgot to add: the poller item count for this example device is 12.
@TheWitness is the above info all that you were looking for?
Okay, bug was confirmed and this is resolved now.
This is still broken when boost is running. Looking to get a fix together.
@bmfmancini I'm going to mark this resolved. If after updating tomorrow to the latest in test, you find issues, we can re-open.