
Increasing the number of processes to a critical level in version 1.2.27

Open Ponomarenko50 opened this issue 1 year ago • 13 comments

Describe the bug

After upgrading from version 1.2.25 to version 1.2.27, the number of running processes on the system began to increase gradually (see the screenshots). This problem was also observed in version 1.2.26.


Screenshots

[Screenshots: CactiProcesses-v.1.2.27.png, CactiMemoryUsage-v.1.2.27.png]

About the screenshots: Cacti version 1.2.25 was running until 20:30. At 20:30, version 1.2.27 was installed.

Desktop (please complete the following information):

  • OS: Ubuntu Server 18.04.4 LTS

  • Browser: Mozilla Firefox

  • Cacti version 1.2.27, spine version 1.2.27, MySQL version 5.7.42-0ubuntu0.18.04.1


Ponomarenko50 avatar May 14 '24 02:05 Ponomarenko50

Do you have a list of what process is sticking around?

You can run, for example:

ps -ef | grep php
ps -ef | grep spine

Any errors in the cacti log?

We will need this info to point to a cause
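As a rough sketch of how the requested process list could be summarized, the helper below counts processes whose command name starts with a given prefix. The function name `count_procs` is hypothetical and not part of Cacti or spine; it reads one command name per line, which is what `ps -eo comm=` prints on Linux.

```shell
#!/bin/sh
# Count processes whose command name starts with the given prefix.
# Input: one command name per line on stdin (e.g. from "ps -eo comm=").
# count_procs is a hypothetical helper name, not a Cacti or spine tool.
count_procs() {
    # $1 = command name prefix, e.g. php or spine
    awk -v name="$1" 'index($1, name) == 1 { n++ } END { print n + 0 }'
}

# Example against the live process table (uncomment to use):
# ps -eo comm= | count_procs spine
# ps -eo comm= | count_procs php
```

Running the two example lines once a minute and comparing the counts would show which of the two process types is the one accumulating.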


bmfmancini avatar May 14 '24 03:05 bmfmancini

Hello.


Ponomarenko50 avatar May 14 '24 04:05 Ponomarenko50

The server crashed, the database is corrupted, and there is no way to log in. The server has been restored to version 1.2.25.

Ponomarenko50 avatar May 14 '24 06:05 Ponomarenko50

The Cacti log contains these messages:

....
14/May/2024 16:24:02 - SYSTEM MAINT STATS: Time:0.01
14/May/2024 16:24:02 - POLLER: Poller[1] PID[23679] WARNING: There are 1 processes detected as overrunning a polling cycle, please investigate
14/May/2024 16:24:02 - SYSTEM WARNING: Primary Admin account notifications disabled! Unable to send administrative Email.
14/May/2024 16:25:01 - POLLER: Poller[1] PID[23679] Maximum runtime of 58 seconds exceeded. Exiting.
14/May/2024 16:25:01 - SYSTEM WARNING: Primary Admin account notifications disabled! Unable to send administrative Email.
14/May/2024 16:25:01 - SYSTEM STATS: Time:59.1008 Method:spine Processes:25 Threads:20 Hosts:282 HostsPerProcess:12 DataSources:6270 RRDsProcessed:0
14/May/2024 16:25:02 - POLLER: Poller[1] PID[26200] WARNING: There are 1 processes detected as overrunning a polling cycle, please investigate
...

Increasing the number of processes/threads (to 25/25) changes nothing. With version 1.2.25 installed there were no such messages.
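The SYSTEM STATS entry in the log excerpt reports a run time of about 59 seconds with RRDsProcessed:0, right after the poller hit its 58-second maximum runtime. A minimal sketch for pulling those two fields out of a log follows; `poller_stats` is a hypothetical helper name, and the field layout is assumed to match the line quoted in this comment.

```shell
#!/bin/sh
# Print run time and RRDsProcessed for each "SYSTEM STATS:" line on stdin.
# poller_stats is a hypothetical helper; the field names are taken from
# the log excerpt in this thread.
poller_stats() {
    awk '/SYSTEM STATS:/ {
        t = ""; r = ""
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^Time:/)          { t = $i; sub(/^Time:/, "", t) }
            if ($i ~ /^RRDsProcessed:/) { r = $i; sub(/^RRDsProcessed:/, "", r) }
        }
        print "time=" t " rrds=" r
    }'
}
```

It could be fed from the Cacti log, for example `poller_stats < /var/www/html/cacti/log/cacti.log`, where the log path is an assumption based on the install directory mentioned later in the thread.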

Ponomarenko50 avatar May 14 '24 07:05 Ponomarenko50

That's a lot of processes for the database to handle. My system has significantly more than 10k hosts and I run with only 5 processes and 30 threads. My database server has 112 threads to keep pace. Your settings are overloading the system.

How many database server cores do you have? How many web server cores?

With the poller shut down, run spine in debug as follows and post the results

./spine -V 3 -S -R

TheWitness avatar May 14 '24 12:05 TheWitness

Hello. The Cacti system itself suggested increasing the number of threads (see the log above). I increased it, but nothing changed. How many threads per host, and how many processes/threads in total (for Spine), do you recommend setting?

My server is a regular computer with a 4-core processor and 8 GB of RAM.

I replaced Spine version 1.2.27 with version 1.2.25 (keeping Cacti 1.2.27), and the problem of the growing number of processes went away (see screenshots). There are no errors in the Cacti logs and no errors in cacti_stderr.log.

There are messages in cacti_stderr.log:

...
Replacing action 3
Replacing action 4
Replacing action 6
Replacing action 7
Replacing action 2
...

Is this fine?

[Screenshots: CactiProcesses-v 1 2 27-2, CactiMemoryUsage-v 1 2 27-2]

Ponomarenko50 avatar May 14 '24 23:05 Ponomarenko50


Good afternoon. I can't provide the spine log because it contains a lot of confidential information. There are no error messages in the log file; these are the last messages displayed:

Total[24.4489] Device[606] HT[2] DS[11922] TT[998.40] SNMP: v2: 192.168.98.94, dsname: traffic_in, oid: .1.3.6.1.2.1.2.2.1.10.1, value: 783007776
Total[24.4489] Device[606] HT[2] DS[11922] TT[998.41] SNMP: v2: 192.168.98.94, dsname: traffic_out, oid: .1.3.6.1.2.1.2.2.1.16.1, value: 44954243
Total[24.4489] Device[606] HT[2] DS[11923] TT[998.42] SNMP: v2: 192.168.98.94, dsname: traffic_in, oid: .1.3.6.1.2.1.2.2.1.10.2, value: 53318538
Total[24.4489] Device[606] HT[2] DS[11923] TT[998.43] SNMP: v2: 192.168.98.94, dsname: traffic_out, oid: .1.3.6.1.2.1.2.2.1.16.2, value: 800089969
Total[24.4489] Device[606] HT[2] DS[11924] TT[998.44] SNMP: v2: 192.168.98.94, dsname: traffic_in, oid: .1.3.6.1.2.1.2.2.1.10.7, value: 85059471
Total[24.4490] Device[606] HT[2] DS[11924] TT[998.46] SNMP: v2: 192.168.98.94, dsname: traffic_out, oid: .1.3.6.1.2.1.2.2.1.16.7, value: 1579091582
Total[24.4490] Device[606] HT[2] DS[11925] TT[998.47] SNMP: v2: 192.168.98.94, dsname: traffic_in, oid: .1.3.6.1.2.1.2.2.1.10.8, value: 36938751
Total[24.4490] Device[606] HT[2] Total Time: 4.3 Seconds
corrupted size vs. prev_size
2024/05/17 15:38:57 - FATAL: Spine Interrupted by Abort Signal
Aborted (core dumped)
xxxxxx@cacti_server:/usr/local/spine/bin# Unable to flush stdout: Broken pipe
Unable to flush stdout: Broken pipe
Unable to flush stdout: Broken pipe
Unable to flush stdout: Broken pipe
Unable to flush stdout: Broken pipe
Unable to flush stdout: Broken pipe
Unable to flush stdout: Broken pipe
Unable to flush stdout: Broken pipe
Unable to flush stdout: Broken pipe^C

These messages are duplicated in the log file cacti_stderr.log.

Ponomarenko50 avatar May 17 '24 06:05 Ponomarenko50

Can you provide the entirety of the output of the spine run? What are your MariaDB/MySQL values for max_connections and max_used_connections?

SHOW GLOBAL VARIABLES LIKE 'max_connections';
SHOW GLOBAL STATUS LIKE 'max_used_connections';

Those "Replacing action X" messages do not appear to be in mainline Cacti. Can you research that?

cd /var/www/html/cacti
grep -r "Replacing"

TheWitness avatar May 18 '24 13:05 TheWitness

Hello.

mysql> SHOW GLOBAL VARIABLES LIKE 'max_connections';
+-----------------+-------+
| Variable_name   | Value |
+-----------------+-------+
| max_connections | 850   |
+-----------------+-------+
1 row in set (0,00 sec)

mysql> SHOW GLOBAL STATUS LIKE 'max_used_connections';
+----------------------+-------+
| Variable_name        | Value |
+----------------------+-------+
| Max_used_connections | 550   |
+----------------------+-------+
1 row in set (0,00 sec)

mysql>

The message "Replacing action X" is not in Cacti; it is in Spine (only in version 1.2.25), in the error.c file. The problem of the growing number of connections is not in Cacti but in Spine: under certain conditions, it does not close its connections.
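One way to check whether connections really accumulate is to sample the MySQL status variable Threads_connected over several polling cycles and watch whether it keeps climbing toward the reported peak of 550 (out of max_connections 850). The sketch below is only an illustration: `status_value` and `sample_connections` are hypothetical helper names, and it assumes the `mysql` client can log in without prompting (for example through `~/.my.cnf`).

```shell
#!/bin/sh
# Extract the value for one variable from tab-separated
# "Variable_name<TAB>Value" lines on stdin (mysql -N -B output format).
status_value() {
    awk -F '\t' -v want="$1" 'tolower($1) == tolower(want) { print $2 }'
}

# Print Threads_connected every $1 seconds, $2 times.
# Assumes the mysql client is installed and credentials are preconfigured.
sample_connections() {
    i=0
    while [ "$i" -lt "$2" ]; do
        n=$(mysql -N -B -e "SHOW GLOBAL STATUS LIKE 'Threads_connected'" \
            | status_value Threads_connected)
        echo "$(date '+%H:%M:%S') threads_connected=$n"
        i=$((i + 1))
        sleep "$1"
    done
}
```

For example, `sample_connections 60 10` would take one sample per minute for ten minutes. A count that rises steadily across polling cycles, rather than returning to a baseline after each run, would be consistent with connections not being closed.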

Ponomarenko50 avatar May 21 '24 08:05 Ponomarenko50

You should sanitize it and then send the log to TheWitness at Cacti dot net.

TheWitness avatar May 21 '24 14:05 TheWitness

No feedback

TheWitness avatar Jun 16 '24 15:06 TheWitness