carbon-c-relay icon indicating copy to clipboard operation
carbon-c-relay copied to clipboard

Performance Issue: high sys CPU usage on the nanosleep syscall

Open toni-moreno opened this issue 7 years ago • 2 comments

Hi @grobian , We have notice that our carbon-c-relay (v2.6 ´-2017-01-26-) on my RHEL6.6 is consuming lot of user and system cpu , read (~60%) and nanosleep (~30%). Read is spending user-cpu and nanosleep as sys-cpu .

It seems like nanosleep has a strange behaviour (https://stackoverflow.com/questions/1125297/nanosleep-high-cpu-usage, https://stackoverflow.com/questions/16726872/cpu-high-usage-of-the-usleep-on-cent-os-6-3)

Perhaps could be good if these "sleep" times could be removed allowing threads block until data available, in the meanwhile cpu time would be used by other processes/threads.

#.root:~ > top
top - 15:36:51 up 46 days, 11:00,  2 users,  load average: 9.95, 8.72, 8.14
Tasks: 286 total,   1 running, 285 sleeping,   0 stopped,   0 zombie
Cpu(s): 42.7%us, 30.1%sy,  0.0%ni, 25.8%id,  0.0%wa,  0.1%hi,  1.3%si,  0.0%st
Mem:  12190152k total,  7435500k used,  4754652k free,   696264k buffers
Swap:  2031612k total,        0k used,  2031612k free,  4446144k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
12258 carbon-c  20   0 3046m 197m  896 S 317.7  1.7  31631:29 carbon-c-relay
48583 apache    20   0 1141m 125m 5500 S 59.4  1.1   0:29.97 httpd.worker
50521 apache    20   0  629m  64m 4924 S 19.2  0.5   0:00.58 httpd.worker
2237 carbon    20   0  308m  21m 1728 S 10.9  0.2   4355:28 carbon-cache.py
49460 apache    20   0  822m  87m 6424 S  5.3  0.7   0:09.95 httpd.worker


#.root:~ > strace -f -c -p 12258
Process 12258 attached with 43 threads - interrupt to quit
^CProcess 12258 detached
Process 12260 detached
Process 12261 detached
Process 12262 detached
Process 12263 detached
Process 12264 detached
Process 12265 detached
Process 12266 detached
Process 12267 detached
Process 12268 detached
Process 12269 detached
Process 12270 detached
Process 12271 detached
Process 12272 detached
Process 12273 detached
Process 12274 detached
Process 12275 detached
Process 12276 detached
Process 12277 detached
Process 12278 detached
Process 12279 detached
Process 12280 detached
Process 12281 detached
Process 12282 detached
Process 12283 detached
Process 12284 detached
Process 12285 detached
Process 12286 detached
Process 12287 detached
Process 12288 detached
Process 12289 detached
Process 12290 detached
Process 12291 detached
Process 12292 detached
Process 12293 detached
Process 12294 detached
Process 12295 detached
Process 12296 detached
Process 12297 detached
Process 12298 detached
Process 12299 detached
Process 12300 detached
Process 12301 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
59.02 1861.055146         301   6177866   5997479 read
27.41  864.251631      174069      4965           nanosleep
  9.39  295.981045         318    929635           write
  1.48   46.592231         314    148396           mprotect
  0.64   20.276546         441     46002           poll
  0.48   15.035243         327     45942           close
  0.44   13.768712         299     46001           accept
  0.42   13.320914         290     46006           fcntl
  0.41   12.951691         282     46000           getpeername
  0.24    7.700703         321     24014      6211 futex
  0.05    1.515535       94721        16           restart_syscall
  0.01    0.398242         306      1300           rt_sigprocmask
  0.01    0.196499         302       650           rt_sigaction
  0.00    0.001629         326         5           socket
  0.00    0.001397         349         4           setsockopt
  0.00    0.001021         511         2           getsockname
  0.00    0.000936         234         4           recvmsg
  0.00    0.000767         384         2           open
  0.00    0.000696         232         3         3 connect
  0.00    0.000689         345         2           fstat
  0.00    0.000688         344         2           munmap
  0.00    0.000000           0         2           mmap
  0.00    0.000000           0         2           sendto
  0.00    0.000000           0         2           bind
------ ----------- ----------- --------- --------- ----------------
100.00 3153.051961               7516823   6003693 total

toni-moreno avatar Jun 06 '17 14:06 toni-moreno

It seems to suggest the sleeps should be longer. The sleeps are a vital part in avoiding contention/thundering hurd behaviour, they are not used to synchronise or something.

grobian avatar Jun 06 '17 18:06 grobian

Because poll is now used to see if there's something to read, the read() cost should go down. A semaphore is now used with a timeout, this is where the sleeptime should go now.

grobian avatar Jul 04 '20 15:07 grobian