
Can't access libvirt hosts via dns name

Open rktjmp opened this issue 3 years ago • 11 comments

  • [X] I have the latest version of sake
  • [X] I have searched through the existing issues

Info

  • OS

    • [X] Linux
  • Shell

    • [X] Zsh
  • Version: 0.12.1 Commit: ada7097 Date: 2022-10-16T06:28:36Z

Problem / Steps to reproduce

I have some VMs running under libvirt, which are normally accessible via their host name, but sake can't resolve them. Resolution does work in other tools (ping, pyinfra, ansible, ssh, etc) so I don't think it's a general configuration error.

servers:
  localhost:
    host: 0.0.0.0
    local: true
  vm:
    host: host1vm

tasks:
  ping:
    desc: Pong
    cmd: echo "pong"
λ sake run ping --all


 Unreachable Hosts

 server | host    | user | port | error
--------+---------+------+------+---------------------------------------------------------------
 vm     | host1vm | soup | 22   | dial tcp: lookup host1vm on 127.0.0.53:53: server misbehaving

λ ping host1vm
PING host1vm (192.168.122.244) 56(84) bytes of data.
64 bytes from 192.168.122.244 (192.168.122.244): icmp_seq=1 ttl=64 time=0.141 ms
^C64 bytes from 192.168.122.244: icmp_seq=2 ttl=64 time=0.132 ms

You can use https://github.com/rktjmp/virt-up to bring up named hosts, but it does require some setup.

rktjmp avatar Oct 21 '22 09:10 rktjmp

A raw Go test does work:

package main

import (
        "net"
        "fmt"
        "os"
)

func main() {
        ips, err := net.LookupIP("host1vm")
        if err != nil {
                fmt.Fprintf(os.Stderr, "Could not get IPs: %v\n", err)
                os.Exit(1)
        }
        for _, ip := range ips {
                fmt.Printf("host1vm. IN A %s\n", ip.String())
        }
}
λ go run main.go
host1vm. IN A 192.168.122.244

rktjmp avatar Oct 21 '22 10:10 rktjmp

Great find, I need to implement the LookupIP method you provided. Just to be safe, does it work as intended when you paste the IP directly?

alajmo avatar Oct 21 '22 10:10 alajmo

Direct IP works.

servers:
  ip:
    host: 192.168.122.244
λ sake run ping --servers ip

TASK [ping: Pong] ******************************************************************************

192.168.122.244 | pong

rktjmp avatar Oct 21 '22 10:10 rktjmp

What's your local resolver? It works for me without any changes, I have my nameserver set to my pi-hole:

/etc/resolv.conf

local resolver
domain lan
search lan
nameserver 192.168.1.209

Seems sake is resolving to 127.0.0.53:53 in your case, I just want to make sure I can replicate your environment and fix it correctly.

alajmo avatar Oct 21 '22 14:10 alajmo

# This is /run/systemd/resolve/stub-resolv.conf managed by man:systemd-resolved(8).
# Do not edit.
#
# This file might be symlinked as /etc/resolv.conf. If you're looking at
# /etc/resolv.conf and seeing this text, you have followed the symlink.
#
# This is a dynamic resolv.conf file for connecting local clients to the
# internal DNS stub resolver of systemd-resolved. This file lists all
# configured search domains.
#
# Run "resolvectl status" to see details about the uplink DNS servers
# currently in use.
#
# Third party programs should typically not access this file directly, but only
# through the symlink at /etc/resolv.conf. To manage man:resolv.conf(5) in a
# different way, replace this symlink by a static file or a different symlink.
#
# See man:systemd-resolved.service(8) for details about the supported modes of
# operation for /etc/resolv.conf.

nameserver 127.0.0.53
options edns0 trust-ad
search .
λ resolvectl
Global
           Protocols: +LLMNR +mDNS -DNSOverTLS DNSSEC=no/unsupported
    resolv.conf mode: stub
Fallback DNS Servers: 1.1.1.1#cloudflare-dns.com 9.9.9.9#dns.quad9.net 8.8.8.8#dns.google
                      2606:4700:4700::1111#cloudflare-dns.com 2620:fe::9#dns.quad9.net
                      2001:4860:4860::8888#dns.google

Link 2 (eno1)
    Current Scopes: DNS LLMNR/IPv4 LLMNR/IPv6
         Protocols: +DefaultRoute +LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
Current DNS Server: 192.168.20.1
       DNS Servers: 192.168.20.1

Link 4 (docker0)
Current Scopes: none
     Protocols: -DefaultRoute +LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported

Link 5 (virbr0)
Current Scopes: LLMNR/IPv4
     Protocols: -DefaultRoute +LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported

Link 45 (br-77f2899f0eb4)
Current Scopes: none
     Protocols: -DefaultRoute +LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported

Link 71 (tap0)
Current Scopes: LLMNR/IPv6
     Protocols: -DefaultRoute +LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported

Link 72 (tap1)
Current Scopes: LLMNR/IPv6
     Protocols: -DefaultRoute +LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported

Link 73 (tap2)
Current Scopes: LLMNR/IPv6
     Protocols: -DefaultRoute +LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported

You might also have to setup libvirt-nss https://libvirt.org/nss.html

rktjmp avatar Oct 21 '22 14:10 rktjmp

FWIW, adjusting the Makefile like so:

@@ -53,9 +53,9 @@ mock-performance-ssh:
 	cd ./test && docker-compose -f docker-compose-performance.yaml up
 
 build:
-	CGO_ENABLED=0 go build \
+	CGO_ENABLED=1 go build \
 	-ldflags "-s -w -X '${PACKAGE}/cmd.version=${VERSION}' -X '${PACKAGE}/cmd.commit=${GIT}' -X '${PACKAGE}/cmd.date=${DATE}'" \
-	-a -tags netgo -o dist/${NAME} main.go
+	-a -tags netcgo -o dist/${NAME} main.go
 
 build-all:
 	goreleaser release --skip-publish --rm-dist --snapshot
@@ -63,7 +63,7 @@ build-all:
 build-and-link:
 	go build \
 		-ldflags "-w -X '${PACKAGE}/cmd.version=${VERSION}' -X '${PACKAGE}/cmd.commit=${GIT}' -X '${PACKAGE}/cmd.date=${DATE}'" \
-		-a -tags netgo -o dist/${NAME} main.go
+		-a -tags netcgo -o dist/${NAME} main.go
 	cp ./dist/sake ~/.local/bin/sake
 
 gen-man:

Then running

λ GODEBUG=netdns='cgo+9' ./sake run ping --servers vm
go package net: confVal.netCgo = true  netGo = false
go package net: using cgo DNS resolver
go package net: hostLookupOrder(host1vm) = cgo

TASK [ping: Pong] ***************************************************************

host1vm | pong

Related but unhelpful (I'm not running alpine, just regular archlinux-amd64): https://groups.google.com/g/golang-nuts/c/G-faJ0bthz0

I'm not a Go dev; my very cursory understanding is that netcgo will try to call the system resolver instead of Go's own, and perhaps Go's own implementation won't recurse by default or whatever.

Not sure if it's reasonable to just perform the net.LookupIP before calling ssh.Dial. You'd think dial would just use that internally but perhaps not.

Actually, it's possible the working test above defaults to netcgo?
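For reference, the "look up first, then dial" idea could be sketched roughly as below. `resolveAddr` is a hypothetical helper, not something sake provides, and the result would be passed to `ssh.Dial` in place of the host name. One caveat: within a single binary, `net.LookupIP` and the dialer normally go through the same default resolver, so in a `netgo` build this is unlikely to change the outcome on its own.

```go
package main

import (
	"fmt"
	"net"
	"os"
)

// resolveAddr looks up host with net.LookupIP and returns "ip:port" for the
// first address found. Hypothetical helper for illustration only.
func resolveAddr(host, port string) (string, error) {
	ips, err := net.LookupIP(host)
	if err != nil {
		return "", fmt.Errorf("lookup %s: %w", host, err)
	}
	if len(ips) == 0 {
		return "", fmt.Errorf("lookup %s: no addresses", host)
	}
	return net.JoinHostPort(ips[0].String(), port), nil
}

func main() {
	host := "host1vm" // host name from this issue; pass another as argv[1]
	if len(os.Args) > 1 {
		host = os.Args[1]
	}
	addr, err := resolveAddr(host, "22")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	// addr could now be handed to ssh.Dial("tcp", addr, config).
	fmt.Println(addr)
}
```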

rktjmp avatar Oct 22 '22 08:10 rktjmp

In fact, it seems it does:

λ GODEBUG=netdns=9 go run main.go
go package net: confVal.netCgo = false  netGo = false
go package net: dynamic selection of DNS resolver
go package net: hostLookupOrder(host1vm) = cgo
host1vm. IN A 192.168.122.60

rktjmp avatar Oct 22 '22 08:10 rktjmp

May not be solvable without cgo?

https://news.ycombinator.com/item?id=17799874 (2018, but the code is largely unchanged in master)

Here's the code that determines if it needs to fall back to cgo: https://github.com/golang/go/blob/161874da2ab6d5372043a1f393...

Notably, it can only handle the following sources: files, dns, myhostname, mdns* (but only a subset of those if you read the gory details).

It doesn't handle the fairly uncommon "mymachines" or "resolve" (sometimes used in the systemd world nowadays)

λ cat /etc/nsswitch.conf
# Name Service Switch configuration file.
# See nsswitch.conf(5) for details.

passwd: files systemd
group: files [SUCCESS=merge] systemd
shadow: files systemd
gshadow: files systemd

publickey: files

hosts: mymachines libvirt libvirt_guest resolve [!UNAVAIL=return] files myhostname dns
networks: files

protocols: files
services: files
ethers: files
rpc: files

netgroup: files

It's reasonable to close this as an unsupported usecase if you felt so.

rktjmp avatar Oct 22 '22 09:10 rktjmp

It seems you can set the resolver in the Dial function (https://stackoverflow.com/questions/30043248/why-golang-lookup-function-cant-provide-a-server-parameter). I will investigate this a bit further (cheers for all the investigation you did). It would be a shame not to be able to use sake when you have virtual machines locally, especially since this already works with established software (ssh, pyinfra, ansible, etc.). Obviously making it work automatically would be best, but perhaps a user option could be used as a last resort.

Ideally, I'd want to avoid cgo, you could make different builds (with/without cgo) but it isn't pretty IMO.
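A rough sketch of what setting the resolver on the dialer could look like with only the standard library. `resolverFor` is a hypothetical helper, and `192.168.122.1:53` is an assumption (the usual address of libvirt's default-network dnsmasq), not something confirmed in this thread. This points the pure Go resolver at one specific DNS server; it does not make Go honour NSS modules such as libvirt_guest.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"time"
)

// resolverFor returns a resolver that sends every DNS query to nameserver
// (an "ip:port" string) instead of the servers from /etc/resolv.conf.
// Hypothetical helper for illustration only.
func resolverFor(nameserver string) *net.Resolver {
	return &net.Resolver{
		PreferGo: true, // the Dial hook is only used by Go's built-in resolver
		Dial: func(ctx context.Context, network, _ string) (net.Conn, error) {
			d := net.Dialer{Timeout: 2 * time.Second}
			return d.DialContext(ctx, network, nameserver)
		},
	}
}

func main() {
	dialer := &net.Dialer{
		Timeout: 5 * time.Second,
		// Assumed address of the libvirt network's DNS server.
		Resolver: resolverFor("192.168.122.1:53"),
	}
	conn, err := dialer.Dial("tcp", "host1vm:22")
	if err != nil {
		fmt.Println("dial failed:", err)
		return
	}
	defer conn.Close()
	fmt.Println("connected to", conn.RemoteAddr())
}
```

The downside is that the nameserver has to be known up front, which is why this would probably end up as a user option rather than something automatic.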

alajmo avatar Oct 22 '22 10:10 alajmo

Yeah I did wonder if pulling in a more complete dns library would be the only real fix. Happy to test a branch if you decide to go that way.

I can just use the IP addresses to hit the virtual machines but it's a bit of a bore changing them each time.

I do wonder if you'd end up having the same issue pop up with the default DNS in other cases, since its focus is allegedly narrow, but any non-cgo/glibc resolver won't be able to use funny things people might put in nsswitch, so :shrug:

rktjmp avatar Oct 22 '22 11:10 rktjmp

I've read a bit more and it seems:

  • Couldn't find any 3rd party libraries to resolve this
  • CGO_ENABLED is set to 1 by default
  • It only affects the os/user and net packages (and any 3rd party libraries that rely on C code), specifically DNS lookup and user lookup, where it will use the cgo resolver instead (depending on some variables, it will default to the go resolver in some situations)
  • With CGO_ENABLED=1, users can override the DNS resolver at runtime using GODEBUG=netdns='go', but not vice versa (you can't switch to the cgo resolver when built with CGO_ENABLED=0)
  • The go resolver won't ever reach feature parity with glibc resolvers, which is quite understandable
  • There could be issues when building on Alpine for other platforms

Anyway, I will set CGO_ENABLED=1, as it is by default. That way there's a workaround if you want to use the go resolver (creating an alias: alias sake='GODEBUG=netdns=go sake'), and it will have the same behavior as other established software, plus a 0.5 MB size decrease.

alajmo avatar Oct 23 '22 13:10 alajmo