nixos-infect
nixos-infect copied to clipboard
Loosing network on OVH VPS
I tried on a VPS 2016 SSD 3 from OVH, each of these :
- Debian 8
- Debian 9
- Ubuntu 16.04
And each time I run the script, then installation + reboot are going well. But after that, I can't connect to the VPS anymore. I accessed it by KVM to debug and it seems that the VPS just get disconnected from network.
The network detection part of the script is not very robust and assumes digitalocean's idiosyncrasies. Likely , it didn't grab the right settings for your host. I'll probably improve it sometime in the future, but for now, try the following to manually provision your hosts:
(if you have console access to an already-provisioned host, do only step 5, instead using the ip info provided in the OVH web UI, and then nixos-rebuild switch)
- copy over the nixos-infect script
- edit it
- comment out the last 4 lines (
makeSwaptoreboot) - run
source nixos-infect; set +e. This will generate the config files but not try to install everything yet. - edit /etc/nixos/networking.nix, correcting any obvious errors. Use commands
ip addr,ip route, andcat /etc/resolv.confto obtain any missing info. Probably remove eth1 entirely. - edit nixos-infect again, and uncomment the lines you commented
bash -x nixos-infect
Post me the networking.nix contents, if you still can't get it working.
Yeah, for OVH, don't even bother with the networking, just rip it out completely.
Change this:
imports = [
./hardware-configuration.nix
./networking.nix # generated at runtime by nixos-infect
$NIXOS_IMPORT
];
to this:
imports = [
./hardware-configuration.nix
$NIXOS_IMPORT
];
I just successfully installed doing that. Debian 9
I should probably add a --no-networking option that does this, or detect when the original system's network uses dhcp instead of manual config.
Is this still an issue, since we do this?
I'll try to check it out, then tell you if it's okay now.
@asymmetric Yes, this is still an issue.
I have big issues to make it run on master, does any of you (@asymmetric, @charlycoste, @kniteli ) would have a functioning version ? I could take care of piggy-backing on it to set up a PR for master to function on OVH.
Don't use OVH, sorry.
@TheSirC if you still need help, attach a .log here with the output after you set -x and run the script (comment out the reboot), because I have no detail on what your "big issues" are.
@elitak Yes, of course. Here is the error-log :
Error log
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 11.4978 s, 93.4 MB/s
swapon: /tmp/nixos-infect.rbzUv.swp: found swap signature: version 1d, page-size 4, same byte order
swapon: /tmp/nixos-infect.rbzUv.swp: pagesize=4096, swapsize=1073741824, devsize=1073741824
The problem lies in the fact that even with reboot command on the system does not reboot and the instance is not provided with usual NixOS commands (nix-env, nixos-rebuild, etc).
I "bisected" that execution arrive to the makeConf function and just "don't execute it" (I can not find any trace of the commands in there leaving any traces on the system). The script exits with error code 1.
Run set -x, before you run the script, for more detail; that should get you the exact line that fails.
I actually added it to the script itself without further output. I added to the interactive session with this output :
root@address:~# ./nixos-infect
+ ./nixos-infect
And immediately returning to the interactive prompt.
After further testing (and multiple reinstalls of the VPS to make sure to work on a clean system each time) I found that :
- the script is stopping here, on the grep part; running the command myself sends back an empty string, totally normal the file is empty but does exist ! (that is considered as a fail for grep: it sends back error code 1).
- Here the script does not include a refresh of the packages list (e.g.
apt-get update) which can make it fail.
Fun fact : The following commands do not produce the same output and I really would like to know why :
bash -x script(<-- I ran this one to have output on the script)- adding
set -xto the script after the shebang - doing
set -xin your interactive prompt and then running./script
So after applying patches for the above-cited issues I opened a pull-request that worked for me on OVH.
Resolved since 3 years....
Anyway i've similar problem on different budget provider. It's kinda weird -> vps does respond to ping but seems that it have all ports closed. Any ideas what's the problem?
Anyway if i've enought time, i'll try do step by step what this script does and play a little with config. Maybe #61 is the solution?