coreos-installer
coreos-installer copied to clipboard
downloads/installs slowly
In kola testiso we spawn a local webserver and point qemu at it. The legacy installer downloads and writes to disk in about one second. This new one takes...about 4 minutes. Something must be going terribly wrong :smile:
I didn't bother to try to debug this much yet, but it seems like the host webserver is spending a lot of time idle.
I'm going to guess two things: not using O_DIRECT anymore (via dd's oflag=direct). and possibly using a small block size (with dd we had bs=1M). std::io::copy does opportunistically use copy_file_range, but I'm not sure if this works for block devices. It may be falling back to a read & write loop in 8K blocks, which... will take a while. Probably worth using our own 1M read & write loop and possibly using O_DIRECT.
Edit: halfway through that comment I actually did check the Rust implementation but forgot to link to the relevant bit: https://github.com/rust-lang/rust/blob/c20d7eecbc0928b57da8fe30b2ef8528e2bdd5be/src/libstd/sys/unix/fs.rs#L1202
That's fs::copy. io::copy seems to be doing a straightforward copy loop.
Ahh yup, you're right. So we're definitely falling back to 8k blocks then.
We could pretty easily reimplement copy() on top of copy_n() from #187.
I did this but it fails to compile because...well, Rust correctly points out that it wouldn't be threadsafe :tada:
diff --git a/Cargo.lock b/Cargo.lock
index 5f75a3a..594370f 100644
--- a/Cargo.lock
+++ b/Cargo.lock
@@ -119,6 +119,7 @@ dependencies = [
"byte-unit 3.0.3 (registry+https://github.com/rust-lang/crates.io-index)",
"clap 2.33.0 (registry+https://github.com/rust-lang/crates.io-index)",
"cpio 0.2.0 (registry+https://github.com/rust-lang/crates.io-index)",
+ "crossbeam-utils 0.7.2 (registry+https://github.com/rust-lang/crates.io-index)",
"error-chain 0.12.2 (registry+https://github.com/rust-lang/crates.io-index)",
"flate2 1.0.14 (registry+https://github.com/rust-lang/crates.io-index)",
"maplit 1.0.2 (registry+https://github.com/rust-lang/crates.io-index)",
@@ -146,6 +147,16 @@ dependencies = [
"cfg-if 0.1.10 (registry+https://github.com/rust-lang/crates.io-index)",
]
+[[package]]
+name = "crossbeam-utils"
+version = "0.7.2"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+dependencies = [
+ "autocfg 1.0.0 (registry+https://github.com/rust-lang/crates.io-index)",
+ "cfg-if 0.1.10 (registry+https://github.com/rust-lang/crates.io-index)",
+ "lazy_static 1.4.0 (registry+https://github.com/rust-lang/crates.io-index)",
+]
+
[[package]]
name = "dtoa"
version = "0.4.5"
@@ -1195,6 +1206,7 @@ dependencies = [
"checksum core-foundation-sys 0.6.2 (registry+https://github.com/rust-lang/crates.io-index)" = "e7ca8a5221364ef15ce201e8ed2f609fc312682a8f4e0e3d4aa5879764e0fa3b"
"checksum cpio 0.2.0 (registry+https://github.com/rust-lang/crates.io-index)" = "f00177cda1bb5c5b0a05b1b506125c776411cdfc6a41085126881266e0cdc242"
"checksum crc32fast 1.2.0 (registry+https://github.com/rust-lang/crates.io-index)" = "ba125de2af0df55319f41944744ad91c71113bf74a4646efff39afe1f6842db1"
+"checksum crossbeam-utils 0.7.2 (registry+https://github.com/rust-lang/crates.io-index)" = "c3c7c73a2d1e9fc0886a08b93e98eb643461230d5f1925e4036204d5f2e261a8"
"checksum dtoa 0.4.5 (registry+https://github.com/rust-lang/crates.io-index)" = "4358a9e11b9a09cf52383b451b49a169e8d797b68aa02301ff586d70d9661ea3"
"checksum encoding_rs 0.8.22 (registry+https://github.com/rust-lang/crates.io-index)" = "cd8d03faa7fe0c1431609dfad7bbe827af30f82e1e2ae6f7ee4fca6bd764bc28"
"checksum error-chain 0.12.2 (registry+https://github.com/rust-lang/crates.io-index)" = "d371106cc88ffdfb1eabd7111e432da544f16f3e2d7bf1dfe8bf575f1df045cd"
diff --git a/Cargo.toml b/Cargo.toml
index 956524c..481d8a6 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -27,6 +27,7 @@ lto = true
byte-unit = "^3.0"
clap = "^2.33"
cpio = "^0.2"
+crossbeam-utils = "0.7.2"
error-chain = { version = "^0.12", default-features = false }
flate2 = "^1.0"
nix = "^0.17"
diff --git a/src/download.rs b/src/download.rs
index 45761be..38ea83c 100644
--- a/src/download.rs
+++ b/src/download.rs
@@ -13,12 +13,14 @@
// limitations under the License.
use byte_unit::Byte;
+use crossbeam_utils::thread;
use error_chain::bail;
use flate2::read::GzDecoder;
use nix::unistd::isatty;
use progress_streams::ProgressReader;
use std::fs::{remove_file, File, OpenOptions};
-use std::io::{copy, stderr, BufRead, BufReader, Read, Seek, SeekFrom, Write};
+use std::io::{stderr, BufRead, BufReader, Read, Seek, SeekFrom, Write};
+use std::os::unix::io::FromRawFd;
use std::num::NonZeroU32;
use std::os::unix::io::AsRawFd;
use std::path::{Path, PathBuf};
@@ -162,6 +164,35 @@ fn write_image_and_sig(
Ok(())
}
+fn threaded_pipe_copy<R : Read + Send>(src: R, dest: &mut File) -> Result<()> {
+ let mut buf = Vec::new();
+ buf.reserve_exact(64 * 1024 * 1024);
+ let mut buf = buf.as_mut_slice();
+
+ let (pipe_r, pipe_w) = nix::unistd::pipe()?;
+ let mut pipe_r = unsafe { std::fs::File::from_raw_fd(pipe_r) };
+ let mut pipe_w = unsafe { std::fs::File::from_raw_fd(pipe_w) };
+ Ok(thread::scope(|s| -> Result<()> {
+ let h1 = s.spawn(move |_| -> Result<()> {
+ loop {
+ let len = match src.read(buf) {
+ Ok(0) => return Ok(()),
+ Ok(len) => len,
+ Err(ref e) if e.kind() == std::io::ErrorKind::Interrupted => continue,
+ Err(e) => return Err(e.into()),
+ };
+ pipe_w.write_all(&buf[..len])?;
+ }
+ });
+ let h2 = s.spawn(move |_| -> Result<()> {
+ std::io::copy(&mut pipe_r, dest)?;
+ Ok(())
+ });
+ h1.join().expect("thread join")?;
+ Ok(h2.join().expect("thread join")?)
+ }).expect("scope join")?)
+}
+
/// Copy the image to disk and verify its signature.
pub fn write_image(
source: &mut ImageSource,
@@ -286,7 +317,7 @@ pub fn write_image(
// sparse-copy the image, falling back to non-sparse copy if hardware
// acceleration is unavailable. But BLKZEROOUT doesn't support
// BLKDEV_ZERO_NOFALLBACK, so we'd risk gigabytes of redundant I/O.
- copy(&mut decompress_reader, dest).chain_err(|| "decoding and writing image")?;
+ threaded_pipe_copy(&mut decompress_reader, dest).chain_err(|| "decoding and writing image")?;
// verify_reader has now checked the signature, so fill in the first MiB
dest.seek(SeekFrom::Start(0))
This is an interesting case because we definitely want to run decompression in a separate thread; but it's also intermixed with progress and GPG and...
Also, we should be using O_DIRECT right?
OK wow I think this got worse:
[core@ibm-p8-kvm-03-guest-02 ~]$ [ 24.279163] coreos-installer[1543]: Read disk 55.2 MiB/854.7 MiB (6%)
[ 25.282045] coreos-installer[1543]: Read disk 66.4 MiB/854.7 MiB (7%)
[ 26.284139] coreos-installer[1543]: Read disk 75.7 MiB/854.7 MiB (8%)
[ 27.508213] coreos-installer[1543]: Read disk 76.3 MiB/854.7 MiB (8%)
[ 28.547356] coreos-installer[1543]: Read disk 76.3 MiB/854.7 MiB (8%)
[ 30.954612] coreos-installer[1543]: Read disk 76.9 MiB/854.7 MiB (9%)
[ 32.736845] coreos-installer[1543]: Read disk 77.0 MiB/854.7 MiB (9%)
[ 34.501298] coreos-installer[1543]: Read disk 77.0 MiB/854.7 MiB (9%)
[ 36.193813] coreos-installer[1543]: Read disk 77.0 MiB/854.7 MiB (9%)
[ 37.962886] coreos-installer[1543]: Read disk 77.1 MiB/854.7 MiB (9%)
[ 39.774410] coreos-installer[1543]: Read disk 77.1 MiB/854.7 MiB (9%)
[ 41.607017] coreos-installer[1543]: Read disk 77.1 MiB/854.7 MiB (9%)
[ 42.616462] coreos-installer[1543]: Read disk 77.2 MiB/854.7 MiB (9%)
[ 44.969971] coreos-installer[1543]: Read disk 79.3 MiB/854.7 MiB (9%)
[ 46.725812] coreos-installer[1543]: Read disk 79.3 MiB/854.7 MiB (9%)
[ 48.567503] coreos-installer[1543]: Read disk 79.3 MiB/854.7 MiB (9%)
[ 49.973242] coreos-installer[1543]: Read disk 79.4 MiB/854.7 MiB (9%)
[ 50.975088] coreos-installer[1543]: Read disk 94.3 MiB/854.7 MiB (11%)
[ 53.773692] coreos-installer[1543]: Read disk 98.7 MiB/854.7 MiB (11%)
[ 55.649689] coreos-installer[1543]: Read disk 98.7 MiB/854.7 MiB (11%)
[ 57.514102] coreos-installer[1543]: Read disk 98.7 MiB/854.7 MiB (11%)
[ 59.439394] coreos-installer[1543]: Read disk 98.8 MiB/854.7 MiB (11%)
[ 61.198396] coreos-installer[1543]: Read disk 98.8 MiB/854.7 MiB (11%)
[ 62.947567] coreos-installer[1543]: Read disk 98.8 MiB/854.7 MiB (11%)
[ 64.696044] coreos-installer[1543]: Read disk 98.9 MiB/854.7 MiB (11%)
[ 66.244836] coreos-installer[1543]: Read disk 98.9 MiB/854.7 MiB (11%)
[ 68.015308] coreos-installer[1543]: Read disk 98.9 MiB/854.7 MiB (11%)
[ 69.766382] coreos-installer[1543]: Read disk 98.9 MiB/854.7 MiB (11%)
[ 71.554605] coreos-installer[1543]: Read disk 99.0 MiB/854.7 MiB (11%)
[ 73.299237] coreos-installer[1543]: Read disk 99.0 MiB/854.7 MiB (11%)
[ 75.035509] coreos-installer[1543]: Read disk 99.0 MiB/854.7 MiB (11%)
[ 76.786172] coreos-installer[1543]: Read disk 99.1 MiB/854.7 MiB (11%)
[ 78.529187] coreos-installer[1543]: Read disk 99.1 MiB/854.7 MiB (11%)
[ 80.278279] coreos-installer[1543]: Read disk 99.1 MiB/854.7 MiB (11%)
[ 82.031717] coreos-installer[1543]: Read disk 99.2 MiB/854.7 MiB (11%)
[ 83.782988] coreos-installer[1543]: Read disk 99.2 MiB/854.7 MiB (11%)
[ 85.560431] coreos-installer[1543]: Read disk 99.2 MiB/854.7 MiB (11%)
[ 87.320502] coreos-installer[1543]: Read disk 99.3 MiB/854.7 MiB (11%)
[ 89.070928] coreos-installer[1543]: Read disk 99.3 MiB/854.7 MiB (11%)
[ 90.853538] coreos-installer[1543]: Read disk 99.3 MiB/854.7 MiB (11%)
[ 92.603431] coreos-installer[1543]: Read disk 99.4 MiB/854.7 MiB (11%)
[ 94.353034] coreos-installer[1543]: Read disk 99.4 MiB/854.7 MiB (11%)
[ 95.387746] coreos-installer[1543]: Read disk 99.4 MiB/854.7 MiB (11%)
[ 96.390873] coreos-installer[1543]: Read disk 103.9 MiB/854.7 MiB (12%)
[ 97.390964] coreos-installer[1543]: Read disk 107.8 MiB/854.7 MiB (12%)
[ 98.395250] coreos-installer[1543]: Read disk 111.4 MiB/854.7 MiB (13%)
[ 99.401100] coreos-installer[1543]: Read disk 115.4 MiB/854.7 MiB (13%)
[ 100.416172] coreos-installer[1543]: Read disk 118.0 MiB/854.7 MiB (13%)
[ 101.422613] coreos-installer[1543]: Read disk 120.9 MiB/854.7 MiB (14%)
[ 102.429683] coreos-installer[1543]: Read disk 124.7 MiB/854.7 MiB (14%)
[ 103.429702] coreos-installer[1543]: Read disk 128.5 MiB/854.7 MiB (15%)
[ 104.433470] coreos-installer[1543]: Read disk 131.7 MiB/854.7 MiB (15%)
[ 105.435846] coreos-installer[1543]: Read disk 135.8 MiB/854.7 MiB (15%)
[ 106.437101] coreos-installer[1543]: Read disk 138.7 MiB/854.7 MiB (16%)
[ 107.437893] coreos-installer[1543]: Read disk 143.0 MiB/854.7 MiB (16%)
[ 108.444530] coreos-installer[1543]: Read disk 146.3 MiB/854.7 MiB (17%)
That's ~1.08 MB/s which is absolutely awful.
I've noticed it being really really slow lately too. Like REALLY slow.
Here's a profile from current git master:
https://perfht.ml/2xLdVoJ
Basically I think what this is saying is that we are losing out by doing decompression and I/O on the same core - this type of thing is where classical Unix pipelines like curl | gunzip > /dev/disk win.
BTW I learned about the cool profiler GUI above from this: https://www.reddit.com/r/rust/comments/g0wkp0/til_firefox_profiler_makes_an_awesome_gui_for_perf/
OK and here's some code I wrote to try to do the decompression/write in separate threads: https://github.com/cgwalters/coreos-installer/tree/copy-faster But...it seems not to help not too much...reduces 25 seconds to 20 or so: https://profiler.firefox.com/public/7aa5ff8e8b9225854fcbb3cf793874fb9c2363da/calltree/?globalTrackOrder=0&localTrackOrderByPid=1936-0-1~&thread=0&v=4
@cgwalters your initial profile also had a lot of trips through read and write, are we possibly missing some buffering? Also, does anything change if you switch the underlying implementation in flate2?
Hmm you know, a raw
[root@toolbox ~]# time /bin/sh -c 'gunzip < /host/var/mnt/workdir-tmp/fedora-coreos-31.20200327.dev.2-metal.x86_64.raw.gz > /dev/vdb'
real 0m21.823s
user 0m10.501s
sys 0m1.403s
Is basically the same order as the installer, as well as:
[root@toolbox ~]# time /bin/sh -c 'gunzip < /host/var/mnt/workdir-tmp/fedora-coreos-31.20200327.dev.2-metal.x86_64.raw.gz | dd of=/dev/vdb iflag=fullblock bs=8M oflag=direct'
341+0 records in
341+0 records out
2860515328 bytes (2.9 GB, 2.7 GiB) copied, 21.8208 s, 131 MB/s
real 0m21.823s
user 0m10.621s
sys 0m1.622s
[root@toolbox ~]#
In fact I'm being stupid here, 20s is nowhere near the minutes I've seen elsewhere. So the problem may actually lie somewhere else in testiso.go somehow?
And of course for my tests I should have fetched over HTTP...I just thought the problem was lower level than that. Here's a http fetch profile:
https://profiler.firefox.com/public/5c8ac1a2603e6d04abe1eb8101b389f164fdc0e2/calltree/?globalTrackOrder=0&localTrackOrderByPid=2366-0-1-2~&thread=0&v=4
I've noticed better performance using --image-file:
$ time sudo coreos-installer install /dev/sda --ignition-file example.ign --insecure --image-file ./fedora-coreos-31.20200415.dev.0-metal.x86_64.raw.gz
Copying image from ./fedora-coreos-31.20200415.dev.0-metal.x86_64.raw.gz
Reading signature from ./fedora-coreos-31.20200415.dev.0-metal.x86_64.raw.gz.sig
Couldn't read signature file: No such file or directory (os error 2)
Signature not found; skipping verification as requested
> Read disk 712.8 MiB/712.8 MiB (100%)
Writing Ignition config
Install complete.
real 0m53.368s
user 0m10.889s
sys 0m4.245s
This is opposed to 8+ minutes at times over local http (via python3 -m http.server).
#290 switched to a copy() implementation with a larger buffer.
I spent some more time on this and got nowhere, except to find that #290 was incorrect. https://github.com/coreos/coreos-installer/pull/315 should be more correct, but may not help much either.