
Very poor performance with concurrent random reads

Open · njaard opened this issue 5 years ago · 2 comments

mfs version: 3.0.109
os: Debian buster
hw: 16-core AMD EPYC
net: 1 Gbit ethernet (3 chunkservers) and 10 Gbit ethernet (for one chunkserver)

I have a process that does a lot of random reads over roughly 10,000 files, totalling 175 TB of data. The process opens all the files, then spawns a number of threads; each thread repeatedly picks a file descriptor at random and does a pread at a "random" offset in that file.
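
Roughly, the access pattern looks like this minimal sketch, assuming POSIX pread() and pthreads (the paths, file count, thread count, and read size are illustrative placeholders, not my actual values):

```c
/* Minimal sketch of the workload: open many files up front, then have
 * several threads issue pread()s at random offsets.  File paths, the
 * file count, and the read size are illustrative placeholders. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define NFILES   16         /* ~10,000 in the real workload */
#define NTHREADS 8
#define READSZ   (64 * 1024)

static int   fds[NFILES];
static off_t sizes[NFILES];

static void *reader(void *arg)
{
    unsigned seed = (unsigned)(uintptr_t)arg;   /* per-thread PRNG state */
    char buf[READSZ];

    for (int i = 0; i < 1000; i++) {
        int   f   = rand_r(&seed) % NFILES;     /* random file descriptor */
        off_t off = (off_t)rand_r(&seed) % (sizes[f] - READSZ); /* random offset */
        if (pread(fds[f], buf, READSZ, off) < 0)
            perror("pread");
    }
    return NULL;
}

int main(void)
{
    char path[64];

    for (int f = 0; f < NFILES; f++) {
        snprintf(path, sizeof path, "/mnt/mfs/data/file%04d", f); /* hypothetical path */
        fds[f] = open(path, O_RDONLY);
        if (fds[f] < 0) { perror(path); return 1; }
        sizes[f] = lseek(fds[f], 0, SEEK_END);  /* assumes each file is larger than READSZ */
    }

    pthread_t t[NTHREADS];
    for (int i = 0; i < NTHREADS; i++)
        pthread_create(&t[i], NULL, reader, (void *)(uintptr_t)i);
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(t[i], NULL);
    return 0;
}
```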

Of course random reads will be slower than sequential reads, but I am frustrated that I can improve performance "for free" simply by partitioning my reads over multiple mfsmounts, as sketched below.
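
By "partitioning over multiple mfsmounts" I mean mounting the same MooseFS namespace at several mount points and steering a subset of the reader threads at each one, along these lines (the mount points and master host name are placeholders):

```sh
# mount the same MooseFS namespace twice (paths and master host are placeholders)
mfsmount /mnt/mfs-a -H mfsmaster
mfsmount /mnt/mfs-b -H mfsmaster
# the reading process then opens half its files under /mnt/mfs-a
# and the other half under /mnt/mfs-b
```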

It seems like mfsmount should treat each reading thread fairly: one thread should not void the cache for another thread that is reading at the same time.

njaard · Dec 11 '19 22:12

Random reads scream out for SSDs or NVMe drives.

The biggest performance improvement you can get is by making sure you are using Jumbo Frames (MTU 9000 on all interfaces). Note that many switches and routers don't support this.
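
For what it's worth, on Linux that's something like the following, assuming iproute2 and an interface named eth0 (a placeholder); the MTU must match on every host, switch, and router along the path:

```sh
# raise the MTU on one interface (eth0 is a placeholder)
ip link set dev eth0 mtu 9000
# verify the change took effect
ip link show dev eth0
```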

dagelf · Apr 13 '22 06:04

Be careful with Jumbo Frames; we don't recommend them as a general rule, because they usually cause problems rather than help. In certain configurations they can be helpful, but that's rather rare.

@njaard MooseFS doesn't void any cache while reading, only when writing on a different machine. Reading with many threads doesn't change anything. But we have no control over the kernel cache (i.e., kernel policies). So, unless you see information in the oplog that you think shouldn't be there and that indicates cache voiding, this is all the kernel. Perhaps, when you use several mounts, the kernel reserves cache space for each one separately?
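
For reference, mfsmount exposes the operation log as a hidden pseudo-file at the root of the mount, so you can watch it like this (the mount path is a placeholder):

```sh
# stream mfsmount's operation log and look for unexpected cache-related entries
cat /mnt/mfs/.oplog
```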

chogata · Apr 14 '22 11:04