
Boost - confuses indexes

Open gadzet21 opened this issue 1 year ago • 13 comments

Hi. After updating Cacti from 1.2.17 to 1.2.27, I noticed strange jumps in the graphs, and only when a given graph is opened. Interestingly, not everywhere (not in every template).

To check whether Cacti itself was error-free after the update, I installed a fresh copy of Cacti 1.2.27 on a separate server (Debian), with spine and boost enabled, imported the standard templates, and added localhost. The error is the same and can be seen in the Main Poller - Collector Items graph. If no one is logged in to Cacti, data is collected correctly, as can be seen below: (screenshot)

But if I log in and view the graph, the indexes swap places! (screenshot)

And the headers: (screenshot)

The graph itself: (screenshot)

Logs:

2024-06-13 09:29:02 - SPINE: Poller[Main Poller] PID[138801] PT[140017500419776] Device[Localhost] HT[1] DS[Main Poller - Collector Items] Graphs[Main Poller - Collector Items] TT[0.21] SS[0] SERVER: /usr/share/cacti/site/scripts/ss_cpoller.php ss_cpoller 'get' 'getSNMP' '1', output: 0
2024-06-13 09:29:02 - SPINE: Poller[Main Poller] PID[138801] PT[140017500419776] Device[Localhost] HT[1] DS[Main Poller - Collector Items] Graphs[Main Poller - Collector Items] TT[0.19] SS[0] SERVER: /usr/share/cacti/site/scripts/ss_cpoller.php ss_cpoller 'get' 'getScriptServer' '1', output: 16
2024-06-13 09:29:02 - SPINE: Poller[Main Poller] PID[138801] PT[140017500419776] Device[Localhost] HT[1] DS[Main Poller - Collector Items] Graphs[Main Poller - Collector Items] TT[0.45] SS[0] SERVER: /usr/share/cacti/site/scripts/ss_cpoller.php ss_cpoller 'get' 'getScript' '1', output: 3
...
2024-06-13 10:03:38 - BOOST The RRDpath template is getScript:getScriptServer:getSNMP
2024-06-13 10:03:38 - BOOST The RRDpath is /usr/share/cacti/site/rra/127_0_0_1_getsnmp_23.rrd
2024-06-13 10:03:38 - BOOST Local Data ID: 23, Sorted Results: 9
2024-06-13 10:03:38 - BOOST Local Data ID: 23, Archive Results: 0, Boost Results: 9
2024-06-13 10:03:38 - BOOST The RRDpath template is uactive:usleeping
2024-06-13 10:03:38 - BOOST The RRDpath is /usr/share/cacti/site/rra/127_0_0_1_active_10.rrd
2024-06-13 10:03:38 - BOOST The RRDpath template is script:script_server:snmp
2024-06-13 10:03:38 - BOOST The RRDpath is /usr/share/cacti/site/rra/127_0_0_1_snmp_4.rrd
2024-06-13 10:03:38 - BOOST The RRDpath template is active:garbage:sleeping
2024-06-13 10:03:38 - BOOST The RRDpath is /usr/share/cacti/site/rra/127_0_0_1_active_9.rrd
2024-06-13 10:03:38 - BOOST Local Data ID: 10, Sorted Results: 4
2024-06-13 10:03:38 - BOOST Local Data ID: 10, Archive Results: 0, Boost Results: 4
2024-06-13 10:03:38 - BOOST Local Data ID: 4, Sorted Results: 3
2024-06-13 10:03:38 - BOOST Local Data ID: 4, Archive Results: 0, Boost Results: 3
2024-06-13 10:03:38 - BOOST Local Data ID: 9, Sorted Results: 3
2024-06-13 10:03:38 - BOOST Local Data ID: 9, Archive Results: 0, Boost Results: 3
2024-06-13 10:03:38 - BOOST The RRDpath template is hosts:time
2024-06-13 10:03:38 - BOOST The RRDpath is /usr/share/cacti/site/rra/127_0_0_1_hosts_5.rrd
2024-06-13 10:03:38 - BOOST Local Data ID: 5, Sorted Results: 3
2024-06-13 10:03:38 - BOOST Local Data ID: 5, Archive Results: 0, Boost Results: 3
2024-06-13 10:03:38 - BOOST The RRDpath template is invalid:valid
2024-06-13 10:03:38 - BOOST The RRDpath is /usr/share/cacti/site/rra/127_0_0_1_active_8.rrd
2024-06-13 10:03:38 - BOOST Local Data ID: 8, Sorted Results: 3
2024-06-13 10:03:38 - BOOST Local Data ID: 8, Archive Results: 0, Boost Results: 3
2024-06-13 10:03:38 - BOOST The RRDpath template is Processes:Threads
2024-06-13 10:03:38 - BOOST The RRDpath is /usr/share/cacti/site/rra/127_0_0_1_rrdsprocessed_6.rrd
2024-06-13 10:03:38 - BOOST Local Data ID: 6, Sorted Results: 3
2024-06-13 10:03:38 - BOOST Local Data ID: 6, Archive Results: 0, Boost Results: 3

gadzet21 avatar Jun 13 '24 08:06 gadzet21

Interesting. I have not noticed any issues, but I will start digging into this in the AM.

On Thu, Jun 13, 2024, 06:08 gadzet21 @.***> wrote:

php /usr/share/cacti/site/poller_boost.php --verbose --debug --force
DEBUG: Checking if Boost is ready to run.
DEBUG: Last Runtime was 2024-06-13 09:47:33 (1718272053).
DEBUG: Next Runtime is 2024-06-13 11:47:33 (1718279253).
DEBUG: Records Found:214, Max Threshold:1000000.
DEBUG: Time to Run Boost, Force Run is true!
DEBUG: Parallel Process Setup Begins.
DEBUG: Data Sources:17, Concurrent Processes:1
DEBUG: Parallel Process Setup Complete. Ready to spawn children.
DEBUG: About to launch 1 processes.
DEBUG: Launching Boost Process Number 1 Total[7.0214]
DEBUG: About to Spawn a Remote Process [CMD: /usr/bin/php, ARGS: /usr/share/cacti/site/poller_boost.php --child=1 --debug]
2024-06-13 10:05:04 - SYSTEM BOOST STATS: Time:9.03 RRDUpdates:214
2024-06-13 10:05:04 - SYSTEM BOOST DETAIL STATS: RRDUpdates:214, TotalTime:9.03, get_records:0, results_cycle:0, rrd_filename_and_template:0, rrd_lastupdate:0, rrdupdate:0, delete:0
DSSTATS: About to launch 1 processes.
DSSTATS: Launching DSStats Process Number 1 for Type bmaster Total[9.0382]
DEBUG: About to Spawn a Remote Process [CMD: /usr/bin/php, ARGS: /usr/share/cacti/site/poller_dsstats.php --type=bchild --child=1 --debug]
DSSTATS: Processing daily for Thread 0
PHP Warning: A non-numeric value encountered in /usr/share/cacti/site/lib/dsstats.php on line 179
PHP Warning: A non-numeric value encountered in /usr/share/cacti/site/lib/dsstats.php on line 180
PHP Warning: A non-numeric value encountered in /usr/share/cacti/site/lib/dsstats.php on line 181
DSSTATS: DAILY
2024-06-13 10:05:06 - SYSTEM DSSTATS STATS: Time:11.08 Type:DAILY Threads:1 RRDfiles:38 DSSes:50 RRDUser:0.00 RRDSystem:0.00 RRDReal:0.00


bmfmancini avatar Jun 14 '24 02:06 bmfmancini

I have not been able to reproduce this

bmfmancini avatar Jun 14 '24 19:06 bmfmancini

It could be that ifName is not the primary key and a reindex was done, causing a mixup. You might want to reindex.

TheWitness avatar Jun 15 '24 22:06 TheWitness

I installed another Cacti on another server. I started with version 1.2.19 to check whether the above-mentioned problem occurs. After upgrading from version 1.2.22 to 1.2.23, the following occurred:

(screenshot)

I'm currently checking which change could have caused this. @bmfmancini, did you have "Enable On-demand RRD Updating" turned on? @TheWitness, I did a reindex, deleted the graph, and created it again.

gadzet21 avatar Jun 18 '24 13:06 gadzet21

I have remote collectors so boost is enabled by default

Is this happening with all graph types?

bmfmancini avatar Jun 18 '24 13:06 bmfmancini

No, in production after updating Cacti, the problem occurs only in 2 templates:

  • own: Get SNMP Data-Indexed
  • standard: Cacti stats - Data Collector stats - Poller Items (Get Script Server Data Indexed).

As for Poller Items: does your data in the database match what is presented in the graph? For me, when boost is running, it presents the data incorrectly (not in this order):

POLLER: Poller[Main Poller] PID[160779] Device[localhost] DS[Main Poller - Collector Items] Graphs[Main Poller - Collector Items] TT[0.19] SERVER: /usr/share/cacti/scripts/ss_cpoller.php ss_cpoller 'get' 'getScriptServer' '1', output: 16
POLLER: Poller[Main Poller] PID[160779] Device[localhost] DS[Main Poller - Collector Items] Graphs[Main Poller - Collector Items] TT[0.25] SERVER: /usr/share/cacti/scripts/ss_cpoller.php ss_cpoller 'get' 'getScript' '1', output: 3
POLLER: Poller[Main Poller] PID[160779] Device[localhost] DS[Main Poller - Collector Items] Graphs[Main Poller - Collector Items] TT[0.7] SERVER: /usr/share/cacti/scripts/ss_cpoller.php ss_cpoller 'get' 'getSNMP' '1', output: 0

Graph: (screenshot)

If I turn off "On-demand RRD Updating": (screenshot)

gadzet21 avatar Jun 18 '24 13:06 gadzet21

Are there any errors in the Cacti.log? What it outputs in those POLLER entries you showed above is how it is stored into the database. If you are saying that the output from the POLLER output above is what is wrong, then I'm left scratching my head.

TheWitness avatar Jun 20 '24 00:06 TheWitness

Take the Cacti Stats device, enable debug for it, and then wait for a bunch of entries to start appearing in the Cacti log. See if it ever echoes the wrong information to the Cacti.log.

TheWitness avatar Jun 20 '24 00:06 TheWitness

Data is collected correctly in the database (poller_output_boost).

I placed an additional logging line in the boost_rrdtool_function_update function in lib/boost.php:

if ($valid_entry) {
	if ($rrd_update_template != '') {
		rrdtool_execute("update $rrd_path $update_options --template $rrd_update_template $rrd_update_values", false, RRDTOOL_OUTPUT_STDOUT, $rrdtool_pipe, 'BOOST');
		cacti_log("update $rrd_path $update_options $rrd_update_template $rrd_update_values", false, 'BOOST', POLLER_VERBOSITY_MEDIUM); //ADD
	} else {
		cacti_log("update $rrd_path $update_options $rrd_update_template $rrd_update_values", false, 'BOOST', POLLER_VERBOSITY_MEDIUM);
		rrdtool_execute("update $rrd_path $update_options $rrd_update_values", false, RRDTOOL_OUTPUT_STDOUT, $rrdtool_pipe, 'BOOST');
	}
	return 'OK';
}

If boost itself (automatically) updates the data after the thresholds set in the configuration are exceeded, the result in the logs is as follows:

2024-06-20 09:29:10 - BOOST update /usr/share/cacti/rra/localhost_getsnmp_23.rrd --skip-past-updates getScript:getScriptServer:getSNMP 1718866741:1:9:0 1718866801:1:9:0 1718866862:1:9:0 1718866921:1:9:0 1718866982:1:9:0 1718867042:1:9:0 1718867101:1:9:0 1718867162:1:9:0 1718867221:1:9:0 1718867281:1:9:0 1718867342:1:9:0 1718867401:1:9:0 1718867462:1:9:0 1718867522:1:9:0 1718867581:1:9:0 1718867642:1:9:0 1718867701:1:9:0 1718867761:1:9:0 1718867822:1:9:0 1718867881:1:9:0 1718867942:1:9:0 1718868002:1:9:0 1718868061:1:9:0 1718868122:1:9:0 1718868181:1:9:0 1718868241:1:9:0 1718868302:1:9:0 1718868361:1:9:0 1718868422:1:9:0 1718868481:1:9:0 1718868541:1:9:0

But if I go to graph_view.php where the graph is located, the logs show:

2024-06-20 09:44:28 - BOOST update /usr/share/cacti/rra/localhost_getsnmp_23.rrd --skip-past-updates getScript:getScriptServer:getSNMP 1718869441:0:1:9

As you can see, the values themselves have been re-sorted and no longer match the names. I wonder why this doesn't happen everywhere... I'll pursue the matter further.

gadzet21 avatar Jun 20 '24 07:06 gadzet21

If you turn off boost does the graph plot correctly?

TheWitness avatar Jun 20 '24 10:06 TheWitness

I just read that post. So it's the on-demand graphing between boost updates that scrambles the results. That's good to know; I think we can fix it.

TheWitness avatar Jun 20 '24 10:06 TheWitness

I think I have a lead at this point.

In lib/boost.php, the boost_process_poller_output function sorts the values and names with:

$sorted = boost_array_orderby($results, 'timestamp', SORT_ASC, 'rrd_name', SORT_ASC);

However, later the names are fetched again with:

$path_template = boost_get_rrd_filename_and_template($local_data_id);

and they are no longer sorted the same way as in the $sorted variable. So we have a discrepancy between the values of $outbuf and $rrd_tmpl.
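The mismatch described here can be sketched in a few lines of Python (purely illustrative, not Cacti code; the names and values are hypothetical stand-ins for the DS names seen in the logs): if the update values are emitted under one ordering while the --template string is built from an independently ordered source, rrdtool simply pairs by position, so values end up attached to the wrong data source names.

```python
# Hypothetical per-DS values for one RRD (not real Cacti output).
values_by_name = {'getScript': 1, 'getScriptServer': 9, 'getSNMP': 0}

# Template names as one source orders them:
template = ['getScript', 'getScriptServer', 'getSNMP']

# Values emitted under a *different* ordering of the same names
# (Python's sorted() compares byte-wise, so 'getSNMP' sorts first):
emitted = [values_by_name[n] for n in sorted(values_by_name)]

# Positional pairing, as rrdtool does with --template, is silently wrong:
paired = dict(zip(template, emitted))
print(paired)  # {'getScript': 0, 'getScriptServer': 1, 'getSNMP': 9}
```

Every value lands on a neighboring name, which matches the "1:9:0 became 0:1:9" symptom in the logs.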

gadzet21 avatar Jun 20 '24 14:06 gadzet21

I now know the details :) The boost_array_orderby function sorts with PHP's array_multisort (ASC), while the boost_get_rrd_filename_and_template function sorts with SQL ORDER BY ... ASC. The difference is in how letter case is ordered:

  • boost_array_orderby uses PHP's array_multisort, which is case sensitive (byte order, per ASCII)
  • boost_get_rrd_filename_and_template uses MySQL ORDER BY ... ASC, which is case insensitive under the default collation

Change proposal: in the boost_array_orderby function, instead of:

$tmp[$key] = $row[$field];

change to:

$tmp[$key] = strtolower($row[$field]);

PS: it may also be worth paying attention to how the collation (the string comparison method) is set in the database.
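The collation difference is easy to demonstrate outside PHP. A small Python sketch (illustrative only, using the same DS names as the logs above) shows that a byte-wise, case-sensitive sort and a case-insensitive sort disagree on these mixed-case names, and that folding the sort key to lowercase (the strtolower() analogue) reproduces the case-insensitive ordering:

```python
names = ['getScript', 'getScriptServer', 'getSNMP']

# Byte-wise, case-sensitive ordering (how PHP's array_multisort compares
# plain strings): uppercase letters sort before lowercase, so 'getSNMP'
# jumps ahead of 'getScript'.
case_sensitive = sorted(names)
print(case_sensitive)    # ['getSNMP', 'getScript', 'getScriptServer']

# Case-insensitive ordering (what MySQL ORDER BY gives under a
# case-insensitive collation such as utf8mb4_general_ci):
case_insensitive = sorted(names, key=str.lower)
print(case_insensitive)  # ['getScript', 'getScriptServer', 'getSNMP']

# Lower-casing the sort key on the PHP side makes both orderings agree:
assert sorted(names, key=str.lower) == case_insensitive
```

The second ordering matches the template string in the logs (getScript:getScriptServer:getSNMP), while the first matches the scrambled value order, which is why lower-casing the key fixes the pairing.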

gadzet21 avatar Jun 21 '24 11:06 gadzet21

@gadzet21, I should be doing some Cacti things this weekend. I'm just finishing up a huge update to Flowview. Still in QA but way cool.

TheWitness avatar Jul 18 '24 21:07 TheWitness

Sorry, been busy. Working on Cacti now if you are watching the commits.

TheWitness avatar Aug 08 '24 22:08 TheWitness

Finally got around to reviewing this one in detail. It's fixed. You just need to pull lib/boost.php from the 1.2.x branch. It's also in develop.

TheWitness avatar Aug 25 '24 01:08 TheWitness

Thanks for your patience @gadzet21! Report back once you've confirmed things.

TheWitness avatar Aug 25 '24 01:08 TheWitness

Okay @gadzet21 and @bmfmancini, I've had to tweak this. You need to update lib/boost.php and try again. I'm pretty sure the new changes are safe.

TheWitness avatar Sep 12 '24 17:09 TheWitness