firecracker icon indicating copy to clipboard operation
firecracker copied to clipboard

[Bug] FC doesn't parse kernel command lines containing -- correctly

Open upxe opened this issue 2 years ago • 7 comments

Describe the bug

Firecracker since PR #2716 (issue #2709) attempts to parse the command line to inject virtio MMIO kernel parameters before any -- in the command line.

That PR has two problems:

  1. It doesn't require the -- to be whitespace-delimited: it incorrectly parses e.g. foo --bar as a kernel parameter of foo and an init argument of bar. The kernel would interpret it as two kernel parameters (foo and --bar), and no init arguments.

  2. It rejects kernel command lines containing multiple -- substrings, irrespective of whitespace delimiting. This means that you can't pass arguments to init that contain --.

To Reproduce

Run a Linux VM with a kernel command line of e.g. foo -- --bar. This is a valid kernel command line: foo is a kernel parameter; --bar is passed as argv[1] to /sbin/init.

Any kernel command containing multiple -- substrings will fail to boot. Any command line containing a non-whitespace-delimited -- substring will be parsed incorrectly. For example, use the reproducer in #2709 but put --dummy at the start of boot_args, before the console=....

Expected behaviour

Firecracker starts the VM.

Environment

  • Firecracker version: 0.25.2 and later
  • Host and guest kernel versions: 5.10.x
  • Rootfs used: none
  • Architecture: x86_64
  • Any other relevant software versions: none

Additional context

This breaks our production use of Firecracker, which uses Linux VMs with an /sbin/init that takes --foo-style arguments, passed on the kernel command line.

I'll submit a PR that splits the command line in a way that more closely follows the kernel's behaviour (parse_args() in kernel/params.c).

Checks

  • [x] Have you searched the Firecracker Issues database for similar problems?
  • [x] Have you read the existing relevant Firecracker documentation?
  • [x] Are you certain the bug being reported is a Firecracker issue?

upxe avatar Jun 01 '22 15:06 upxe

We are currently considering the option of integrating this functionality upstream in rust-vmm, since other projects might be affected by this.

We will follow up once we have made a decision.

bchalios avatar Jun 28 '22 08:06 bchalios

@upxe do you have an example of a boot argument that follows the --bar pattern you are mentioning in the issue? From this documentation I understand that the kernel interprets everything after the first -- as init arguments. Indeed, in Firecracker right now we don't except more than one -- in the kernel command line which leads to Problem 2 as you describe it, but I don't think that we need to handle Point 1.

andreeaflorescu avatar Aug 31 '22 14:08 andreeaflorescu

I investigated this issue a little bit and this is a summary of what I discovered / tried:

  1. According to this documentation it looks like there is no need for a whitespace as you said @andreeaflorescu ; however, the findings below shows that a whitespace is required
  2. The linux kernel seems to parse kernel params extracting each token (sequence of chars between whitespaces) until -- token is found: https://elixir.bootlin.com/linux/v5.10.139/source/kernel/params.c#L184 ; that means we need to check for the whitespace after -- sequence
  3. Because the linux kernel does the parsing by extracting a token and comparing it with --, in case something like foo ---bar is provided, the linux kernel will extract foo as a token followed by ---bar; none of them equals -- so both of them will be interpreted as kernel parameters
  4. I took -b as an example of a valid parameter that could be sent to init (see http://www.skrenta.com/rt/man/init.8.html , section BOOTFLAGS); -e should boot the VM in emergency mode; I checked the following two situations in Firecracker:
  • [ 0.000000] Command line: console=ttyS0 reboot=k panic=1 pci=off root=/dev/vda rw virtio_mmio.device=4K@0xd0000000:5 -- -b

Running like this, the VM booted in emergency mode, meaning that the -e was parsed (correctly) as init arg.

  • [ 0.000000] Command line: console=ttyS0 reboot=k panic=1 pci=off root=/dev/vda rw virtio_mmio.device=4K@0xd0000000:5 ---b

Running like this, the VM did NOT boot in emergency mode, meaning that the -- was not identified as a separator between kernel params and init args.

So, that being said, it looks to me t.=hat the whitespace is required after --, even though the spec does NOT specify it explicitly.

What do you think? Am I right? Did I miss something?

andreitraistaru avatar Sep 01 '22 07:09 andreitraistaru

Great investigation @andreitraistaru! That looks correct and matches my understanding as well. I think with this though minimal changes are required to the Firecracker code to fix the problem:

  • use a delimiter that takes into consideration the whitespace
  • use splitn instead of split so that we always get at most 2 tokens: what is before --[whitespace] and everything that is after: https://doc.rust-lang.org/std/primitive.str.html#method.splitn

We can propagate this change to rust-vmm as well, but as we discussed offline the rust-vmm code needs a bit of a redesign to be able to accommodate the change.

andreeaflorescu avatar Sep 01 '22 10:09 andreeaflorescu

Ok, sounds good! I'll come up with the code changes for the rust-vmm and the integration in Firecracker of the bugfix.

andreitraistaru avatar Sep 01 '22 10:09 andreitraistaru

@andreeaflorescu, @andreitraistaru thank you for your input. The kernel documentation is ambiguous in its description of --. In particular, "Everything after “--” is passed as an argument to init" is misleading, in that it doesn't mention that that -- should be whitespace-delimited. It's also slightly more complicated than that because the kernel supports double-quoting of command line arguments (e.g foo="bar -- qux" blah, where the -- is part of the foo value, not a separator). My patch doesn't cover that.

For clarity, our production workloads require multiple "--foo" and "--foo=bar" style arguments to init passed on the kernel command line, and the format of those arguments is beyond our control. We thus need to support kernel command lines like console=ttyS0 panic=1 -- --foo=bar --qux=abc in Firecracker, where Firecracker must inject its virtio MMIO arguments before the -- delimiter. We don't need correct handling of double-quoted arguments.

I've attached a tool for comparing kernel command line handling in Firecracker and QEMU. Please remove the .txt filename extensions, added to placate GitHub:

To run:

  1. Compile a kernel with the given kernel configuration (I used 5.10.140).
  2. Put both bzImage and vmlinux in the current directory.
  3. Ensure that firecracker, rustc, jq, qemu-system-x86_64 are on your PATH.
  4. Run ./test 'kernel command line'. This will output the init argv and environment under QEMU and Firecracker:
$ ls
bzImage  firecracker*  kernel.config  main.rs  test*  vmlinux*

$ ./test 'kernel command line'
Compiling main.rs
Building initrd
1050 blocks
Kernel command line: >>>kernel command line<<<

QEMU
argv[0] = /init
argv[1] = kernel
argv[2] = command
argv[3] = line
HOME=/
TERM=linux

Firecracker
argv[0] = /init
argv[1] = kernel
argv[2] = command
argv[3] = line
HOME=/
TERM=linux

Outputs are the same

With this we can investigate the bug. In all cases below, AFAICT QEMU is correct; Firecracker is incorrect:

  1. Case 1 in the description: the -- that separates kernel args from init args should be whitespace-delimited. In the following, init should have no arguments and an environment variable called foo--bar with a value of 42:

    $ ./test 'foo--bar=42'
    Kernel command line: >>>foo--bar=42<<<
    
    QEMU
    argv[0] = /init
    HOME=/
    TERM=linux
    foo--bar=42
    
    Firecracker
    argv[0] = /init
    argv[1] = foo
    HOME=/
    TERM=linux
    --bar=42
    
  2. Case 2 in the description: multiple -- should be supported:

    $ ./test '--foo --bar'
    Kernel command line: >>>--foo --bar<<<
    
    QEMU
    argv[0] = /init
    argv[1] = --foo
    argv[2] = --bar
    HOME=/
    TERM=linux
    
    Firecracker
    2022-09-01T12:29:12.282409365 [anonymous-instance:main:ERROR:src/firecracker/src/main.rs:498] Building VMM configured from cmdline json failed: KernelCmdline("Too many `--` in kernel cmdline.")
    
  3. Double quotes aren't handled correctly:

    $ ./test 'foo="bar--qux" blah'
    Kernel command line: >>>foo="bar--qux" blah<<<
    
    QEMU
    argv[0] = /init
    argv[1] = blah
    HOME=/
    TERM=linux
    foo=bar--qux
    
    Firecracker
    argv[0] = /init
    argv[1] = blah
    HOME=/
    TERM=linux
    foo=bar --qux
    

Based on my reading of the kernel source code (parse_args in kernel/params.c), the kernel command line is parsed thus:

  1. Tokenise by splitting on whitespace that is not within double quotes. (Whitespace within double quotes must be [a] preserved, and [b] ignored from a tokenisation perspective.)

  2. If that produces a token of -- then all tokens after the -- are passed to init. (Again, you can't just try a substring search / splitting on --, because that's not aware of double quotes.)

  3. Tokens before -- are of the form foo, foo.bar, foo=qux or foo.bar=qux (i.e. a name, and optionally a value; the name may contain a .). Tokens with names unknown to the kernel and not containing a . are passed to init.

  4. "passed to init" means:

    • If foo=bar then add a variable foo with value bar to init's environment.
    • Else append to init's command line arguments.

upxe avatar Sep 01 '22 11:09 upxe

We are considering the rework of the cmdline parser to cover this usecase as well. I'll have a look on that and let you know the progress on our side.

andreitraistaru avatar Sep 06 '22 09:09 andreitraistaru