Secondaries needing memory bump when a second atClient connects

Open cconstab opened this issue 3 years ago • 5 comments

Describe the bug Very odd, but this has now happened twice, with different atSigns (kryz_9580 & visual61). Everything has been working 100% reliably with a single Dart program connecting to the secondary, but when a second atClient codebase connects to the secondary, it crashes for lack of memory.

cconstab@swarm0002-01:~$ docker service ps --no-trunc 2ffeca71-aaad-52d1-ba55-9173d5504f1f_secondary
ID                          NAME                                                   IMAGE                                                                                                                            NODE                                                DESIRED STATE   CURRENT STATE                ERROR                         PORTS
urv1zntztpud8q04xfj8edk2m   2ffeca71-aaad-52d1-ba55-9173d5504f1f_secondary.1       reg.swarm0001.atsign.zone/atsigncompany/secondary:prod@sha256:87898a822e95288c7f57fcb9ed1e9627e56a7b4633c2f480ca3ba0724a5780db   swarm0002-20.us-central1-b.c.secondaries.internal   Running         Running about a minute ago
yst605cyiuja1m3bggi6s8mcn    \_ 2ffeca71-aaad-52d1-ba55-9173d5504f1f_secondary.1   reg.swarm0001.atsign.zone/atsigncompany/secondary:prod@sha256:87898a822e95288c7f57fcb9ed1e9627e56a7b4633c2f480ca3ba0724a5780db   swarm0002-15.us-central1-c.c.secondaries.internal   Shutdown        Failed about a minute ago    "task: non-zero exit (137)"
4045lw4js66denia0omzimlb4    \_ 2ffeca71-aaad-52d1-ba55-9173d5504f1f_secondary.1   reg.swarm0001.atsign.zone/atsigncompany/secondary:prod@sha256:87898a822e95288c7f57fcb9ed1e9627e56a7b4633c2f480ca3ba0724a5780db   swarm0002-20.us-central1-b.c.secondaries.internal   Shutdown        Failed 2 minutes ago         "task: non-zero exit (137)"
ynlvlmgkb9pdqdtzmqqd1x43i    \_ 2ffeca71-aaad-52d1-ba55-9173d5504f1f_secondary.1   reg.swarm0001.atsign.zone/atsigncompany/secondary:prod@sha256:87898a822e95288c7f57fcb9ed1e9627e56a7b4633c2f480ca3ba0724a5780db   swarm0002-06.us-central1-c.c.secondaries.internal   Shutdown        Failed 4 minutes ago         "task: non-zero exit (137)"
ypopet2rfur3yysq7l4v38nid    \_ 2ffeca71-aaad-52d1-ba55-9173d5504f1f_secondary.1   reg.swarm0001.atsign.zone/atsigncompany/secondary:prod@sha256:87898a822e95288c7f57fcb9ed1e9627e56a7b4633c2f480ca3ba0724a5780db   swarm0002-15.us-central1-c.c.secondaries.internal   Shutdown        Failed 6 minutes ago         "task: non-zero exit (137)"
cconstab@swarm0002-01:~$ docker service update --limit-memory 100M 2ffeca71-aaad-52d1-ba55-9173d5504f1f_secondary
2ffeca71-aaad-52d1-ba55-9173d5504f1f_secondary
overall progress: 1 out of 1 tasks
1/1: running   [==================================================>]
verify: Service converged
cconstab@swarm0002-01:~$
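
For anyone checking what limit a secondary is actually running with: the value passed to `--limit-memory` is stored in the service spec as a byte count. The following is a minimal sketch, assuming the Docker CLI on a swarm manager; `bytes_to_mib` is a hypothetical helper name, not something from at_server or Docker.

```shell
#!/bin/sh
# Hypothetical helper: convert a byte count to whole MiB.
bytes_to_mib() {
    echo $(( $1 / 1024 / 1024 ))
}

# On a swarm manager, the configured limit can be read back from the
# service spec (service name taken from the session above):
#   limit=$(docker service inspect \
#     --format '{{.Spec.TaskTemplate.Resources.Limits.MemoryBytes}}' \
#     2ffeca71-aaad-52d1-ba55-9173d5504f1f_secondary)
#   bytes_to_mib "$limit"

# Docker parses "100M" as 100 * 1024 * 1024 = 104857600 bytes.
bytes_to_mib 104857600
```

The `docker service inspect` lines are left as comments so the snippet runs without a swarm; uncomment them on a manager node to read the live value.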

To Reproduce Steps to reproduce the behavior:

  1. First I created a secondary
  2. Then sent notifications via that secondary
  3. And then connected a second client to that secondary
  4. The secondary will then crash with memory starvation
  5. Raise the memory limit to 100M and things are back to normal.

Expected behavior I would not expect a second or additional atClient connecting to use significantly more memory and crash the secondary.

Screenshots If applicable, add screenshots to help explain your problem.

**Dart code used at the time** https://github.com/cconstab/at_nautel_snmp

Were you using an @application when the bug was found? See above.

Additional context Add any other context about the problem here.

cconstab avatar Jul 12 '22 23:07 cconstab

Moving to PR43

gkc avatar Aug 08 '22 14:08 gkc

Investigating this issue. Testing the issue with the code changes done for https://github.com/atsign-foundation/at_server/issues/856.

@cconstab I reckon both secondaries were originally running with 50MB. How long does it take before the crash?

VJag avatar Aug 23 '22 10:08 VJag

> Investigating this issue. Testing the issue with the code changes done for https://github.com/atsign-foundation/at_server/issues/856.
>
> @cconstab I reckon both secondaries were originally running with 50MB. How long does it take before the crash?

About a second or two after the second client connects. This is when the secondary is limited to 50MB. To be clear, the secondary does not crash as such; Docker stops it when the memory limit is hit.
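
This matches the `non-zero exit (137)` in the task list: by shell convention, an exit status above 128 means the process was terminated by signal number (status - 128), and 137 - 128 = 9 is SIGKILL, which is what a container receives when it is killed for exceeding its memory limit. A small sketch of that arithmetic follows; `signal_from_exit` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: for an exit status above 128, print the signal
# number that terminated the process; print nothing otherwise.
signal_from_exit() {
    if [ "$1" -gt 128 ]; then
        echo $(( $1 - 128 ))
    fi
}

sig=$(signal_from_exit 137)   # 137 - 128 = 9
echo "signal number: $sig"

# kill -l maps a signal number to its name (KILL for 9).
kill -l "$sig"

# To confirm an out-of-memory kill on the node that ran the task, the
# stopped container's state can be inspected (container ID not shown here):
#   docker inspect --format '{{.State.OOMKilled}}' <container-id>
```

A status of 137 alone only says the process received SIGKILL; the `OOMKilled` flag is what ties it to the memory limit.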

cconstab avatar Aug 25 '22 14:08 cconstab

Any update during PR45, @VJag?

gkc avatar Sep 19 '22 12:09 gkc

Memory optimisation fixes took care of this issue too. @cconstab, can you please retest and close the issue?

VJag avatar Sep 19 '22 13:09 VJag

Closed

cconstab avatar Jul 22 '23 08:07 cconstab