nixos-rebuild silently crashes out of memory without updating
I hesitated to file this one on nixpkgs.
On a VM with 512 MB of RAM, I've seen the following happen:
$ nixos-rebuild boot --upgrade
unpacking channels...
created 2 symlinks in user environment
building Nix...
building the system configuration...
these derivations will be built:
/nix/store/xqh93bd85ks37l9b30rwa3d4p99qqa2a-system-path.drv
/nix/store/46lgkbf29k9wj7bcrxxfq1sk7pqhqpq3-unit-polkit.service.drv
[…]
these paths will be fetched (55.46 MiB download, 63.54 MiB unpacked):
/nix/store/0ksg3q70h5n4x1v70gvghz45xax6w52n-nixos-version
/nix/store/7gbd1as2whimg6a0d6rfdis5c9syxsl2-linux-4.14.97
[…]
copying path '/nix/store/0ksg3q70h5n4x1v70gvghz45xax6w52n-nixos-version' from 'https://cache.nixos.org'...
copying path '/nix/store/gywc473i8ahighmsj9s6kfik5by9x69a-kernel-modules' from 'https://cache.nixos.org'...
building '/nix/store/5k76blki8zxjcbr3qp9h5jfcin21h918-etc-nixos.conf.drv'...
building '/nix/store/mr6731c960n1j6vj99slmrhqjpy7bafw-etc-os-release.drv'...
[…]
collision between `/nix/store/kkdknnfhqmkb8p9pmmww4xivz6aa9w9f-inetutils-1.9.4/bin/hostname' and `/nix/store/00bgd045z0d4icpbc2yyz4gx48ak44la-net-tools-1.60_p20170221182432/bin/hostname'
collision between `/nix/store/kkdknnfhqmkb8p9pmmww4xivz6aa9w9f-inetutils-1.9.4/bin/dnsdomainname' and `/nix/store/00bgd045z0d4icpbc2yyz4gx48ak44la-net-tools-1.60_p20170221182432/bin/dnsdomainname'
created 7798 symlinks in user environment
If you don't look closely, everything appears to be fine. But…
$ echo $?
137
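For context, an exit status of 137 conventionally means the process was killed by signal 9 (SIGKILL), since shells report a signal death as 128 plus the signal number. A minimal sketch of a wrapper that makes this harder to overlook follows; `run_checked` is a hypothetical helper name, not something nixos-rebuild provides.

```shell
# run_checked is a hypothetical helper name, not part of nixos-rebuild.
# It runs the given command, prints a loud message on failure, and
# returns the command's original exit status.
run_checked() {
  rc_status=0
  "$@" || rc_status=$?
  if [ "$rc_status" -eq 137 ]; then
    # 137 = 128 + 9: the command (or a child whose status it passed on)
    # was killed by SIGKILL, which is what the OOM killer sends.
    echo "FAILED: '$*' was killed by SIGKILL (137 = 128 + 9)." >&2
    echo "Possibly the OOM killer; check the kernel log, e.g. journalctl -k" >&2
  elif [ "$rc_status" -ne 0 ]; then
    echo "FAILED: '$*' exited with status $rc_status" >&2
  fi
  return "$rc_status"
}

# Usage (not run here):
#   run_checked nixos-rebuild boot --upgrade
```

SIGKILL can come from other sources too, so the message only points at the OOM killer as a likely suspect.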
A bit of investigation reveals the OOM killer kicked in:
Jan 06 16:53:39 hostname kernel: update-mime-dat invoked oom-killer: gfp_mask=0x14201ca(GFP_HIGHUSER_MOVABLE|__GFP_COLD), nodemask=(null), order=0, oom_score_adj=0
Jan 06 16:53:39 hostname kernel: update-mime-dat cpuset=/ mems_allowed=0
Jan 06 16:53:39 hostname kernel: CPU: 0 PID: 15023 Comm: update-mime-dat Not tainted 4.14.86 #1-NixOS
Jan 06 16:53:39 hostname kernel: Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.2-0-gf9626ccb91-prebuilt.qemu-project.org 04/01/2014
Jan 06 16:53:39 hostname kernel: Call Trace:
Jan 06 16:53:39 hostname kernel: dump_stack+0x5c/0x85
Jan 06 16:53:39 hostname kernel: [10007] 0 10007 32862 145 10 3 0 0 nixos-rebuild
Jan 06 16:53:39 hostname kernel: [10008] 0 10008 110701 84161 216 3 0 0 nix-build
Jan 06 16:53:39 hostname kernel: [10010] 0 10010 65529 5743 64 3 0 0 nix-daemon
Jan 06 16:53:39 hostname kernel: [14702] 30001 14702 4146 311 11 2 0 0 bash
Jan 06 16:53:39 hostname kernel: [15023] 30001 15023 17141 11334 37 2 0 0 update-mime-dat
Jan 06 16:53:39 hostname kernel: Out of memory: Kill process 10008 (nix-build) score 653 or sacrifice child
Jan 06 16:53:39 hostname kernel: Killed process 10008 (nix-build) total-vm:442804kB, anon-rss:336644kB, file-rss:0kB, shmem-rss:0kB
Jan 06 16:53:39 hostname kernel: oom_reaper: reaped process 10008 (nix-build), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
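A rough sketch of how to pull just the relevant lines out of a kernel log like the one above. The `oom_kills` function name is made up for illustration, and the patterns are taken from the messages shown here; other kernel versions may word them slightly differently.

```shell
# oom_kills is a hypothetical helper: it reads kernel log lines on stdin
# and keeps only those that record the OOM killer being invoked or a
# process being killed.
oom_kills() {
  grep -E 'invoked oom-killer|Out of memory:|Killed process'
}

# Usage on a systemd machine (not run here):
#   journalctl -k --since "1 hour ago" | oom_kills
```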
It seems that, more often than not, this command https://github.com/NixOS/nixpkgs/blob/master/nixos/modules/config/xdg/mime.nix#L27 pushes the memory usage over the line and causes nix-build to be killed.
Ain't nothing wrong with that, but my main source of grief is that the output looks so benign that I assumed the command had succeeded. Not sure why I don't see an OOM error message.
Nixpkgs rev: 135a7f9604c Nix version 2.1.3
I usually stop several services before doing the rebuild (nixos-rebuild switch).
The problem is that sometimes even that doesn't help (e.g. when I add a new package that needs building and it consumes a lot of memory), and then those services are still down (e.g. httpd, MySQL, ...).
And what's the solution here? I'm unable to complete a rebuild on a system with 3 GB of available RAM. All services stopped, etc., and it always gets OOM-killed.
Which brings up another question: is Nix unusable on systems with less than 8 GB of RAM? And exactly how do we rebuild/switch those systems?
For now, can you not make a swapfile of a few GB? My Pinebook Pro has 4GB of RAM, and adding a 4GB swapfile allows building the kernel.
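A sketch of the imperative route for a one-off swapfile, assuming a filesystem where a plain swapfile works (btrfs, for instance, needs extra care) and root privileges. `make_swapfile` is a hypothetical helper name, and the path and size in the usage line are placeholders. On NixOS the declarative equivalent is the `swapDevices` option, which would survive a reboot.

```shell
# make_swapfile is a hypothetical helper: create and enable a swapfile.
#   $1 = path of the swapfile, $2 = size in MiB. Must be run as root.
make_swapfile() {
  swap_path=$1
  swap_mib=$2
  # Write real zeroes rather than a sparse file; swap needs allocated blocks.
  dd if=/dev/zero of="$swap_path" bs=1M count="$swap_mib" &&
    chmod 600 "$swap_path" &&   # swap contents must not be world-readable
    mkswap "$swap_path" &&      # write the swap signature
    swapon "$swap_path"         # enable it until the next reboot
}

# Usage (not run here; 4096 MiB matches the 4GB mentioned above):
#   make_swapfile /swapfile 4096
```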
@lordcirth unfortunately swap isn't an option here, low-spec HDD which is already busy. Swap solves it on some systems, although it's awkward to manually upgrade, but others simply can't use swap.
It's possible to reduce memory consumption a lot by only including the modules that you use in the baseModules: https://github.com/nixos/nixpkgs/blob/a3f0ef0a1fe3bc4a0d9eb176fcac246634d413c2/nixos/lib/eval-config.nix#L16-L17
Unfortunately, nixos-rebuild doesn't support passing that parameter during evaluation. What's possible, though, is to use the new --flake flag, which allows precise control over the NixOS evaluation.
The flake.nix would look something like this:
{
  outputs = { self, nixpkgs }: {
    nixosConfigurations.myhost = import "${nixpkgs}/nixos/lib/eval-config.nix" {
      baseModules = [
        # import all the modules here
      ];
      modules = [ (import ./myhost/configuration.nix) ];
    };
  };
}
And then to build, you would use nixos-rebuild --flake .#myhost build.
This should work in theory. It's probably going to take a while to populate the baseModules with minimal requirements.
So instead of using lib.nixosSystem, which has all the modules in nixpkgs loaded(?), this creates the same structure, but with an empty list of baseModules for you to populate? Would you just build this repeatedly and add every module whose absence breaks the build?
You got it. It will be quite painful to build the full list as module inter-dependencies are not being tracked, but that's the best (only?) way to reduce memory usage. As NixOS gets more and more services defined, the memory usage keeps growing.
I marked this as stale due to inactivity. → More info
Depends on your Nix config of course, but mine failed at 1 GB of RAM (even with an additional 1 GB of swap). Swap definitely helped, though, after I changed it to 4 GB.
On that note, I'm curious and want to do a mini survey of sorts. My system has 8 GB of RAM, and the nixos-rebuild switch step takes about 30 minutes before printing the list of things to build and things to download. How does it go for you? Please note your RAM and the time before the list of things to download prints. If you can, also report the sustained CPU and RAM usage; for me it's about 2.5 of 4 CPUs and 91-94% of RAM.
I was just bit by this on a 1GB VPS service. The solution was to add 2GB of swap. That's the easy part, the hard part was trying to figure out why nixos-rebuild wasn't starting my services (answer: it was silently being killed by the OOM killer).
Faced the same issue on a raspberry pi with 1GB RAM when trying to rebuild using a flake-based configuration; adding 2GB swap fixed this for me too.
Just ran into this when building a DigitalOcean droplet, any chance we can get some kind of warning that this is happening? The silent failure is the real pain here.
Getting this same issue when using WSL; I had to increase the memory available to WSL to 3 GB. The silent failure is very annoying.
I have also encountered the same issue when trying to provision a VM on Proxmox. Started with 512MB of memory, 1GB and then 3GB.
I wanted to run NixOS on a small VM that would only run Nginx. But requiring at least 3 GB of RAM on the VM just to do an update is not ideal.
I can agree. Using nixos-rebuild switch on a 1GB RAM VPS is impossible without using at least 2GB of swap. The whole system will just freeze. NixOS doesn't seem to be an option for people that want declarative config on a low end VPS.
5 years later, with a large enough config, --upgrade managed to throw me into the same problem even with 16 GB of RAM. I had to sit through the painful wait twice before I realized I needed to close the browser and let it cook.
Ran into this issue after an hour installing and configuring a 1 GB VPS. Couldn't figure out why rebuild was not doing anything until I noticed the 127 exit code and the OOM killer logs. Increasing my swapfile from 512 MB to 3 GB fixed the issue, though I could probably get away with less, since I noticed a peak of only ~800 MB of swap usage during the rebuild.
I would love to see some fix that doesn't involve increasing swap though.