Performance: Too many threads

Open sehz opened this issue 2 years ago • 4 comments

Steps to reproduce: run the longevity test:

nohup ./target/release/fluvio-test longevity --timeout 90000  -- --runtime-seconds=70000 --producers 100 --consumers 100 > /tmp/test.out 2> /tmp/test.err &

After a while, the SPU has too many threads (91), as shown by its /proc status:

Name:	fluvio-run
Umask:	0022
State:	S (sleeping)
Tgid:	918569
Ngid:	0
Pid:	918569
PPid:	918456
TracerPid:	0
Uid:	0	0	0	0
Gid:	0	0	0	0
FDSize:	512
Groups:	1 2 3 4 6 10 11 20 26 27 
NStgid:	918569	2722223	1
NSpid:	918569	2722223	1
NSpgid:	918569	2722223	1
NSsid:	918569	2722223	1
VmPeak:	  266964 kB
VmSize:	  256800 kB
VmLck:	       0 kB
VmPin:	       0 kB
VmHWM:	   28184 kB
VmRSS:	   27052 kB
RssAnon:	   12548 kB
RssFile:	   14504 kB
RssShmem:	       0 kB
VmData:	  198528 kB
VmStk:	     132 kB
VmExe:	   30096 kB
VmLib:	       0 kB
VmPTE:	     496 kB
VmSwap:	       0 kB
HugetlbPages:	       0 kB
CoreDumping:	0
THP_enabled:	1
Threads:	91
SigQ:	0/126499
SigPnd:	0000000000000000
ShdPnd:	0000000000000000
SigBlk:	0000000000000000
SigIgn:	0000000200001000
SigCgt:	00000000000004c8
CapInh:	00000000a80425fb
CapPrm:	00000000a80425fb
CapEff:	00000000a80425fb
CapBnd:	00000000a80425fb
CapAmb:	0000000000000000
NoNewPrivs:	0
Seccomp:	0
Speculation_Store_Bypass:	vulnerable
Cpus_allowed:	ff
Cpus_allowed_list:	0-7
Mems_allowed:	00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000001
Mems_allowed_list:	0
voluntary_ctxt_switches:	321
nonvoluntary_ctxt_switches:	19
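To track the thread count over a long run without reading the whole dump each time, the `Threads:` field can be extracted programmatically. The following is only a minimal sketch: `parse_thread_count` is a hypothetical helper, not part of Fluvio, and it reads `/proc/self/status` purely for demonstration (for the SPU you would read `/proc/<pid>/status` with its pid). It assumes Linux.

```rust
use std::fs;

/// Extract the `Threads:` field from the text of a `/proc/<pid>/status` file.
/// Hypothetical helper for illustration; returns None if the field is
/// missing or is not a number.
fn parse_thread_count(status: &str) -> Option<u32> {
    status
        .lines()
        .find_map(|line| line.strip_prefix("Threads:"))
        .and_then(|rest| rest.trim().parse().ok())
}

fn main() {
    // Linux only: on other platforms this file does not exist and the read fails.
    match fs::read_to_string("/proc/self/status") {
        Ok(status) => match parse_thread_count(&status) {
            Some(n) => println!("threads: {n}"),
            None => println!("no Threads field found"),
        },
        Err(e) => eprintln!("could not read /proc/self/status: {e}"),
    }
}
```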

Having this many threads causes the SPU to use too much memory. Ideally, we should have just one thread per core plus a thread pool for blocking I/O. To solve this:

  • Move all I/O to non-blocking async calls (io_uring, etc.)
  • Assign fixed threads to each core.
  • Carefully tune blocking threads.
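The "fixed number of threads for blocking work" idea from the list above can be sketched with the standard library alone. This is not how Fluvio or async-std implement it: `BlockingPool`, its `spawn_blocking` method, and `distinct_threads_used` are hypothetical names made up for this example. It only illustrates that a capped pool keeps the thread count constant however many blocking jobs are queued.

```rust
use std::collections::HashSet;
use std::sync::{mpsc, Arc, Mutex};
use std::thread;
use std::time::Duration;

type Job = Box<dyn FnOnce() + Send + 'static>;

/// A fixed-size pool for blocking work (hypothetical, for illustration only).
struct BlockingPool {
    sender: Option<mpsc::Sender<Job>>,
    workers: Vec<thread::JoinHandle<()>>,
}

impl BlockingPool {
    fn new(size: usize) -> Self {
        assert!(size > 0, "pool needs at least one thread");
        let (sender, receiver) = mpsc::channel::<Job>();
        let receiver = Arc::new(Mutex::new(receiver));
        let workers = (0..size)
            .map(|_| {
                let receiver = Arc::clone(&receiver);
                thread::spawn(move || loop {
                    // The lock is held only while waiting for the next job,
                    // not while the job runs.
                    let job = receiver.lock().unwrap().recv();
                    match job {
                        Ok(job) => job(),
                        Err(_) => break, // queue closed: shut the worker down
                    }
                })
            })
            .collect();
        Self { sender: Some(sender), workers }
    }

    /// Queue a blocking closure; the returned receiver yields its result.
    fn spawn_blocking<T, F>(&self, f: F) -> mpsc::Receiver<T>
    where
        F: FnOnce() -> T + Send + 'static,
        T: Send + 'static,
    {
        let (tx, rx) = mpsc::channel();
        let job: Job = Box::new(move || {
            let _ = tx.send(f());
        });
        self.sender.as_ref().unwrap().send(job).unwrap();
        rx
    }
}

impl Drop for BlockingPool {
    fn drop(&mut self) {
        drop(self.sender.take()); // close the queue so the workers exit
        for worker in self.workers.drain(..) {
            let _ = worker.join();
        }
    }
}

/// Run `jobs` short blocking tasks on `size` threads and count how many
/// distinct OS threads actually executed them.
fn distinct_threads_used(size: usize, jobs: usize) -> usize {
    let pool = BlockingPool::new(size);
    let pending: Vec<_> = (0..jobs)
        .map(|_| {
            pool.spawn_blocking(|| {
                thread::sleep(Duration::from_millis(5)); // stand-in for blocking I/O
                thread::current().id()
            })
        })
        .collect();
    let ids: HashSet<_> = pending.into_iter().map(|rx| rx.recv().unwrap()).collect();
    ids.len()
}

fn main() {
    // One blocking thread per core is an assumption of this sketch, not a tuned value.
    let size = thread::available_parallelism().map(|n| n.get()).unwrap_or(1);
    let used = distinct_threads_used(size, 100);
    println!("100 jobs ran on {used} thread(s), cap was {size}");
}
```

The trade-off is that jobs queue up once every worker is busy, so the cap needs to be tuned against the latency of the blocking calls.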

sehz, Feb 19 '22 05:02

Maybe we can use async_std::task::spawn_blocking, or convert the I/O code to async?

ozgrakkurt, Mar 22 '23 16:03

Hey @sehz!

Tracking how async-std handles thread count limits:

  • https://github.com/async-rs/async-std/blob/bf316b095c176c8738c6401cc62d0bc389c88961/src/rt/mod.rs#L12
  • https://github.com/async-rs/async-global-executor/blob/0abe723db4ad440f5cebbd06f95b6234a8116398/src/config.rs#L62
  • https://doc.rust-lang.org/stable/std/thread/fn.available_parallelism.html

So by default the executor thread count seems to come from std::thread::available_parallelism, and it seems to be controllable through the ASYNC_STD_THREAD_COUNT env var.

Apparently this doesn't cover spawn_blocking calls. Those seem to be controllable through the BLOCKING_MAX_THREADS env var. I found it from here.

So I propose moving all blocking operations to async tasks, or at least to spawn_blocking calls if they aren't done this way already, and then trying to control the maximum number of threads by configuring the env vars. I think this would be easier than setting up io_uring, which is not very mature yet from what I understand.
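To make the env var idea concrete, here is a rough sketch of resolving both limits at startup. It is not the actual async-std or blocking code: `resolve_thread_count` is a hypothetical helper, and falling back to the core count for both variables (including on invalid or zero values) is an assumption made for this example, not the libraries' documented defaults.

```rust
use std::env;
use std::thread;

/// Resolve a thread count from an optional env var value.
/// Hypothetical helper: falls back to `fallback` when the value is
/// missing, not a number, or zero (an assumption of this sketch).
fn resolve_thread_count(raw: Option<&str>, fallback: usize) -> usize {
    raw.and_then(|v| v.trim().parse::<usize>().ok())
        .filter(|&n| n > 0)
        .unwrap_or(fallback)
}

fn main() {
    // Number of cores the process may use; 1 if it cannot be determined.
    let cores = thread::available_parallelism().map(|n| n.get()).unwrap_or(1);

    // Executor threads: env var named in the thread, else one per core.
    let executor = resolve_thread_count(
        env::var("ASYNC_STD_THREAD_COUNT").ok().as_deref(),
        cores,
    );
    // Blocking threads: the per-core fallback is chosen for this sketch only.
    let blocking = resolve_thread_count(
        env::var("BLOCKING_MAX_THREADS").ok().as_deref(),
        cores,
    );

    println!("executor threads: {executor}, max blocking threads: {blocking}");
}
```

Both variables would need to be set in the environment of the SPU process before it starts, since the limits are presumably read when the runtime initializes.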

I'd like to work on this if the approach makes sense.

ozgrakkurt, Mar 22 '23 20:03

All spawn_blocking calls should be using this: https://docs.rs/fluvio-future/0.4.5/fluvio_future/task/fn.spawn_blocking.html. Then maybe it's just a matter of setting BLOCKING_MAX_THREADS?

sehz, Mar 22 '23 20:03

Yeah, seems like it. I'll try to do this.

ozgrakkurt, Mar 22 '23 20:03