Sweeping the accelerator count and read_thread count gives confusing results

Open xdreamcoder opened this issue 5 months ago • 0 comments

I ran evaluations while varying the number of accelerators and the number of read threads. I expected I/O throughput to scale roughly in proportion to accelerator × read_thread until the system hit a bottleneck and leveled off. Instead, as shown below, performance drops dramatically once I use five accelerators: rather than gradually saturating, it collapses.
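
To make the expectation concrete, here is a minimal sketch of how the sweep could be checked against linear scaling. It assumes throughput has been collected into a dict keyed by (accelerators, read_threads). The function names and the 5% tolerance are hypothetical and not part of the tool; the sketch separates "efficiency is falling" (a normal bottleneck) from "absolute throughput went down" (a collapse).

```python
def scaling_efficiency(measured):
    """Compare each configuration against ideal linear scaling.

    `measured` maps (accelerators, read_threads) -> throughput in any
    consistent unit. The smallest accelerators * read_threads product is
    used as the baseline. Returns a dict mapping each configuration to
    measured / expected, where 1.0 means perfectly linear scaling.
    """
    if not measured:
        raise ValueError("no measurements given")
    base_key = min(measured, key=lambda k: k[0] * k[1])
    per_unit = measured[base_key] / (base_key[0] * base_key[1])
    return {
        (acc, thr): tput / (per_unit * acc * thr)
        for (acc, thr), tput in measured.items()
    }


def find_collapses(measured, tolerance=0.05):
    """List configurations where adding accelerators lowered throughput.

    For each read_threads value, walk the configurations in order of
    increasing accelerator count and flag any whose throughput falls more
    than `tolerance` (hypothetical default: 5%) below the best throughput
    already seen with fewer accelerators.
    """
    by_threads = {}
    for (acc, thr), tput in measured.items():
        by_threads.setdefault(thr, []).append((acc, tput))

    collapses = []
    for thr, points in sorted(by_threads.items()):
        best = 0.0
        for acc, tput in sorted(points):
            if tput < best * (1 - tolerance):
                collapses.append((acc, thr))
            best = max(best, tput)
    return collapses
```

A gradual bottleneck would show efficiency sliding below 1.0 while `find_collapses` stays empty; a configuration that appears in `find_collapses` is doing worse than a smaller one, which is the behavior I am asking about.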

For the tests, I only added the read_thread option to the default run command you provided. Before each run, I reformatted the NVMe device and remounted the XFS filesystem (the initialization step).
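
The initialization step could be scripted along these lines. This is a sketch, not a record of the exact commands used for these runs: the helper names are hypothetical, and the device path and mount point are placeholders supplied by the caller. It relies only on the standard `umount`, `mkfs.xfs -f`, and `mount` commands, and defaults to a dry run because `mkfs.xfs -f` destroys the data on the device.

```python
import subprocess


def init_commands(device, mount_point):
    """Build the unmount / reformat / remount command sequence.

    `device` and `mount_point` are caller-supplied placeholders,
    for example an NVMe block device and the directory it is mounted on.
    """
    return [
        ["umount", mount_point],         # fails if nothing is mounted there
        ["mkfs.xfs", "-f", device],      # -f overwrites the existing filesystem
        ["mount", device, mount_point],
    ]


def reinitialize(device, mount_point, dry_run=True):
    """Print the commands, or run them when dry_run is False (needs root)."""
    for argv in init_commands(device, mount_point):
        if dry_run:
            print(" ".join(argv))
        else:
            # check=True stops the sequence at the first failing command.
            subprocess.run(argv, check=True)
```

Running the same sequence before every configuration in the sweep keeps the starting state identical, so differences between runs should come from the accelerator and read_thread settings rather than from leftover filesystem state.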

Could you help me figure out why I’m getting these results? Is there something wrong with my setup?

Aug 12 '25 05:08 xdreamcoder