storage: concurrent disk image write.
Write the disk image concurrently to speed it up compared to the previous sequential write.
Description
Change image DiskWrite from sequential to concurrent, to make it faster.
Changes
- change to a concurrent write (see the sketch below)
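A minimal sketch of the approach, assuming the image is readable through an `io.ReaderAt` (for example a file on the rfs mount) and the destination supports `io.WriterAt`; the function name `concurrentWrite`, the file paths, and the chunk split are illustrative, not the actual zos implementation:

```go
package main

import (
	"io"
	"os"
	"sync"
)

// concurrentWrite copies size bytes from src to dst using the given number
// of workers, each handling a distinct byte range, so the ranges never
// overlap. This is a sketch of the idea, not the actual zos code.
func concurrentWrite(dst *os.File, src io.ReaderAt, size int64, workers int) error {
	chunk := size / int64(workers)
	errs := make(chan error, workers)
	var wg sync.WaitGroup

	for i := 0; i < workers; i++ {
		offset := int64(i) * chunk
		length := chunk
		if i == workers-1 {
			length = size - offset // last worker picks up the remainder
		}

		wg.Add(1)
		go func(offset, length int64) {
			defer wg.Done()
			// SectionReader limits this worker to its own byte range,
			// OffsetWriter writes it back at the same offset (Go 1.20+).
			section := io.NewSectionReader(src, offset, length)
			if _, err := io.Copy(io.NewOffsetWriter(dst, offset), section); err != nil {
				errs <- err
			}
		}(offset, length)
	}

	wg.Wait()
	close(errs)
	return <-errs // nil when no worker reported an error
}

func main() {
	src, err := os.Open("image.raw") // hypothetical source, e.g. a file on the rfs mount
	if err != nil {
		panic(err)
	}
	defer src.Close()

	info, err := src.Stat()
	if err != nil {
		panic(err)
	}

	dst, err := os.Create("disk.img") // hypothetical destination disk image
	if err != nil {
		panic(err)
	}
	defer dst.Close()

	if err := concurrentWrite(dst, src, info.Size(), 10); err != nil {
		panic(err)
	}
}
```

The worker count of 10 simply mirrors the sweet spot seen in the benchmark results further down; in practice it would be a tunable, since more workers showed no gain there.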
Related Issues
Fixes:
- #2391
- #2405
Checklist
- [x] Tests included -> manual test
- [x] Build pass
- [x] Documentation
- [x] Code format and docstring
@muhamadazmy The benchmarking is already in the linked issue. I'm including it here because this is indeed the better place.
I've tried to parallelize the `io.Copy` with this environment:
- zos node: in Indonesia, running on my old 2016 PC on qemu (Indonesia is close to Australia)
- os image: nixos
The results:
- original io.Copy: 1 hour (much worse than the reported 20 mins :))
- 5 goroutines: 20 mins
- 10 goroutines: 20 mins
- 15 goroutines: 24 mins
That's cool.
One thing I have to add: flists with raw images are obsolete and should not be used anymore. Zos still supports them for backwards compatibility only, but new workloads with this kind of image should definitely not be allowed.
Forgot to say: make sure to clean up the cache between benchmark runs, since rfs caches the downloaded content in zos-cache. That means a second run will always be faster than the first, because it doesn't have to download the image again.
Sure, it only took ~2 mins with the cache.
Actually, I didn't only delete the cache; I deleted all the qemu cow disks as well, because I don't know how to delete the cache itself.
I guess it is under /var/cache/modules/flistd?
`var/cache/modules/flistd # ls`
`cache flist log mountpoint pid ro`
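To make the "clean the cache between runs" step concrete, here is a small sketch in Go, assuming the rfs download cache is the `cache` directory seen in the listing above; the path and the helper name are assumptions about this particular test setup, not an official zos tool:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// clearFlistCache removes the rfs download cache so the next benchmark run
// has to fetch the image from the hub again. The path is an assumption
// based on the listing above; adjust it for your node.
func clearFlistCache() error {
	return os.RemoveAll(filepath.Join("/var/cache/modules/flistd", "cache"))
}

func main() {
	if err := clearFlistCache(); err != nil {
		fmt.Println("clear cache:", err)
		return
	}
	start := time.Now()
	// ... run the disk image write being benchmarked here ...
	fmt.Println("elapsed:", time.Since(start))
}
```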
All in all looks good. I also wonder whether this code path should be enabled in the case of HDD-only nodes?
Can you elaborate more on this? Is it because an HDD-only node will be slow? And how does `provisiond` behave regarding this?
zos right now also supports HDD nodes. I believe the sequential nature of HDDs means performance would be impacted (worse) with the concurrency.
On another note, I'd also add the concept of retries to the code if possible.
> zos right now also supports HDD nodes. I believe the sequential nature of HDDs means performance would be impacted (worse) with the concurrency.
~~This is true if we write to a regular file, but this is not the case here:~~
- ~~the destination is `rfs`, and `rfs` doesn't store the image in a single regular file.~~
- also need to be aware that the slowness is on the downloading side (from the hub), not on the write side to the disk image

As a side note, the current code favors SSD disks.
> On another note, I'd also add the concept of retries to the code if possible.

Why not handle it inside rfs?
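For the retries idea, a hedged sketch of what a per-chunk retry wrapper could look like (whether this belongs in zos or inside rfs is exactly the open question here); `withRetries` and its parameters are placeholders, not an existing API:

```go
package main

import (
	"fmt"
	"time"
)

// withRetries re-runs op up to attempts times with a growing delay between
// tries, which helps when failures come from the remote (hub) side.
func withRetries(attempts int, op func() error) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = op(); err == nil {
			return nil
		}
		time.Sleep(time.Duration(i+1) * time.Second)
	}
	return fmt.Errorf("giving up after %d attempts: %w", attempts, err)
}

func main() {
	// Placeholder op standing in for the real per-chunk copy from rfs
	// to the disk image.
	err := withRetries(3, func() error {
		return fmt.Errorf("simulated transient failure")
	})
	fmt.Println(err)
}
```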
> zos right now also supports HDD nodes. I believe the sequential nature of HDDs means performance would be impacted (worse) with the concurrency.

I assume that won't happen, because the slowness from the remote rfs will cover it.
But because it is hard to test, I think disabling it on HDD-only nodes is safer.
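A hedged sketch of the kind of guard being discussed, assuming a Linux node where the kernel exposes `/sys/block/<dev>/queue/rotational` (`1` for spinning disks); the device name and the fallback decision are illustrative, and this is not necessarily what the referenced commit does:

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// isRotational reports whether a block device (e.g. "sda") is a spinning
// disk, based on the kernel's rotational flag.
func isRotational(dev string) (bool, error) {
	data, err := os.ReadFile("/sys/block/" + dev + "/queue/rotational")
	if err != nil {
		return false, err
	}
	return strings.TrimSpace(string(data)) == "1", nil
}

func main() {
	rotational, err := isRotational("sda") // device name is an example
	if err != nil {
		fmt.Println("detect disk type:", err)
		return
	}
	if rotational {
		fmt.Println("HDD: keep the sequential io.Copy path")
	} else {
		fmt.Println("SSD: use the concurrent write path")
	}
}
```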
Fixed it in caf83ac67e274e6ea04e640701eb7506fd8c500d @xmonader