bootstrap-vz icon indicating copy to clipboard operation
bootstrap-vz copied to clipboard

Images created with the vagrant plugin have 30 SATA ports, slowing down the boot process

Open EmmanuelKasper opened this issue 9 years ago • 5 comments

I did some benchmarking between the boxes create with Packer and Bootstrap-vz, and I noticed that the boxes created with Bootstrap-vz were taking a significate longer time to boot.

After calling vagrant up:

vagrant ssh -c "systemd-analyze"
Startup finished in 10.548s (kernel) + 4.129s (userspace) = 14.677s

The kernel time seems to relate to a useless 10 seconds hardware detection of the 30 ports on the SATA controller. From dmesg output:

[    1.116157] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[    1.116599] ata1.00: ATA-6: VBOX HARDDISK, 1.0, max UDMA/133
[    1.116863] ata1.00: 4194304 sectors, multi 128: LBA48 NCQ (depth 31/32)
[    1.117383] ata1.00: configured for UDMA/133
[    1.117694] scsi 0:0:0:0: Direct-Access     ATA      VBOX HARDDISK    1.0  
PQ: 0 ANSI: 5
[    1.364117] tsc: Refined TSC clocksource calibration: 2882.216 MHz
[    1.436145] ata2: SATA link down (SStatus 0 SControl 300)
[    1.756180] ata3: SATA link down (SStatus 0 SControl 300)
[    2.076219] ata4: SATA link down (SStatus 0 SControl 300)
[    2.364146] Switched to clocksource tsc
[    2.396143] ata5: SATA link down (SStatus 0 SControl 300)
[    2.716144] ata6: SATA link down (SStatus 0 SControl 300)
[    3.036172] ata7: SATA link down (SStatus 0 SControl 300)
[    3.356146] ata8: SATA link down (SStatus 0 SControl 300)
[    3.432109] floppy0: no floppy controllers found
[    3.432348] work still pending
[    3.676142] ata9: SATA link down (SStatus 0 SControl 300)
[    3.996141] ata10: SATA link down (SStatus 0 SControl 300)
[    4.316151] ata11: SATA link down (SStatus 0 SControl 300)
[    4.636149] ata12: SATA link down (SStatus 0 SControl 300)
[    4.956142] ata13: SATA link down (SStatus 0 SControl 300)
[    5.276143] ata14: SATA link down (SStatus 0 SControl 300)
[    5.596146] ata15: SATA link down (SStatus 0 SControl 300)
[    5.916144] ata16: SATA link down (SStatus 0 SControl 300)
[    6.236147] ata17: SATA link down (SStatus 0 SControl 300)
[    6.556117] ata18: SATA link down (SStatus 0 SControl 300)
[    6.876170] ata19: SATA link down (SStatus 0 SControl 300)
[    7.196146] ata20: SATA link down (SStatus 0 SControl 300)
[    7.516145] ata21: SATA link down (SStatus 0 SControl 300)
[    7.836145] ata22: SATA link down (SStatus 0 SControl 300)
[    8.156149] ata23: SATA link down (SStatus 0 SControl 300)
[    8.476143] ata24: SATA link down (SStatus 0 SControl 300)
[    8.796145] ata25: SATA link down (SStatus 0 SControl 300)
[    9.116146] ata26: SATA link down (SStatus 0 SControl 300)
[    9.436131] ata27: SATA link down (SStatus 0 SControl 300)
[    9.756160] ata28: SATA link down (SStatus 0 SControl 300)
[   10.076198] ata29: SATA link down (SStatus 0 SControl 300)
[   10.396153] ata30: SATA link down (SStatus 0 SControl 300)

Setting the number of ports back to 1 solves the long detect time.

vagrant ssh -c "systemd-analyze"
Startup finished in 3.548s (kernel) + 4.059s (userspace) = 7.607s

Now the question is where do this settings come from ? Looking at the OVF manifest that we use, ( https://github.com/andsens/bootstrap-vz/blob/master/bootstrapvz/plugins/vagrant/assets/box.ovf#L189 ) the portCount is OK.

After doing some testing, I noticed that this setting is lost when we import the box with vagrant base box.

So my feeling is that either we hit a bug in VirtualBox import command, or the OVF manifest we use is not completely clean. I notices an OVF file exported from VirtualBox 4.3 does not have this ns= prefixing. in the xml tags. Maybe we should rebase our OVF manifest on something newer ?

As an alternative, it should be also possible to drop the SATA controller and attach our disk image to a IDE controller, which has anyway only 4 ports .

Thanks for your attention.

EmmanuelKasper avatar Sep 04 '15 23:09 EmmanuelKasper

Whoa. Nice catch! That is friggin' weird. I think your suggestion about rebasing the ovf template on something newer is the way to go.

andsens avatar Sep 05 '15 09:09 andsens

Maybe this python OVF library help here ? https://pypi.python.org/pypi/cot

EmmanuelKasper avatar Feb 28 '16 09:02 EmmanuelKasper

Nice find! That would be miles better than what we are doing right now.

andsens avatar Feb 28 '16 09:02 andsens

Maybe this python OVF library help here ?

This doesn't looks to be a library, but a CLI application to deal with other kind of "OVF". Not sure if I understood it correctly...

Anyway, I've tried to tackle this issue today with no luck. What intrigues me most is something that Emmanuel had already suspected:

I notices an OVF file exported from VirtualBox 4.3 does not have this ns= prefixing. in the xml tags.

Looking at the box.ovf file from another box (like Canonical's ubuntu/trusty64) that doesn't have this problem, we can see that they are almost exactly our box.ovf template. You can even diff them. Strangely, the box.ovf generate by bootstrap-vz is much different, with ns0 namespaces all over the place.

I've tried to remove the ns0 namespace, but ended up messing the resulting OVF even more. One time it even broke the box to the point that it couldn't even be started by Vagrant anymore, resulting in a VirtualBox error.

In the end, I can't say for sure that the problem lies in this default namespace being added against our will, because I couldn't fix this... But I would bet all my tips in this. Mostly because the line:

<ns0:StorageController Bootable="true" IDE0MasterEmulationPort="0" IDE0SlaveEmulationPort="1" IDE1MasterEmulationPort="2" IDE1SlaveEmulationPort="3" PortCount="1" name="SATA Controller" type="AHCI" useHostIOCache="false">

Becomes:

<StorageController name="SATA Controller" type="AHCI" PortCount="30" useHostIOCache="false" Bootable="true" IDE0MasterEmulationPort="0" IDE0SlaveEmulationPort="1" IDE1MasterEmulationPort="2" IDE1SlaveEmulationPort="3">

After a vagrant up, when the files from the box are imported from ~/.vagrant.d/ to ~/VirtualBox VMs/.

And why it messes with PortCount only and not any one of the other parameters? Boy, I wish I knew...

myhro avatar Apr 08 '16 11:04 myhro

A possible way to mitigate this is to reset the number of ports of the virtual sata controller with

  config.vm.provider "virtualbox" do |vb|
    vb.customize ["storagectl", :id, "--controller", "IntelAHCI", "--name", "SATA Controller", "--portcount", "1"]
  end

in the Vagrant file. Since we ship a Vagrant file, we could even add it here

However as @myhro alluded, I suspect the root of the problem is the write_ovf method in https://github.com/andsens/bootstrap-vz/blob/cf6234bafd0e18088c8eef4aea0ed8cfb256284d/bootstrapvz/plugins/vagrant/tasks.py#L150/tasks.py who might do something wrong ( also maybe explaining the ns0 stuff everywhere)

I tried to replace the box.ovf template we ship with a different ovf template from a fresh virtualbox export but I got parsing errors afterwards. Maybe someone woth python/xpath knowledge might have a better view of this.

EmmanuelKasper avatar Jun 04 '16 19:06 EmmanuelKasper