Results 371 comments of Travis Downs

> On Arm (AArch64), you are using the **YIELD** instruction to delay when spinning. This is not the purpose of the **YIELD** instruction. I am not using it to delay....

@nyh wrote: > where are all those fairly big "source location" strings saved in the executable (I'm not talking about the 8-byte pointers, which are also a problem - I'm...

> There's no DIO thread. Stephan is referring to the kernel dio worker, spawned when doing aio direct IO against FS (to handle completions, etc).

> aio works without any thread normally. There's a dio worker needed to handle write completions, Jens describes it (and removes it in some scenarios, but this is very recent and...

@avikivity - let's let the DIO worker/thread thing go for now: it was only an example of something that _might_ be going wrong. Can you please give the OP a...

> 91 IOPS = 10ms latency if everything is serialized. Yes exactly. > Suggest using kernel-level tools to understand. Given the limited but not zero use of RHEL8 in practice,...

> I can understand this happening once or twice: the file's map isn't there (could happen after a cold start), so we have to have a thread to handle the...

The only practical way I know of to avoid UWEC issues is to actually zero large blocks of the file ahead of application use in a chunky way: but this...

> Later we recycle the segment to avoid the 2X write amplification. Yes, exactly. Regardless of the O_DSYNC and other stuff (this is not obviously better, as discussed elsewhere) that's...

> This is what fsqual measures - context switches involved with writes. If you received a GOOD score, no context switches, so no workqueues. Hmm, I think fsqual led you...