net: Resolver doesn't use provided Dial function in all cases
What version of Go are you using (go version)?
$ go version
go version go1.20.3 linux/arm64
Does this issue reproduce with the latest release?
Yes.
What operating system and processor architecture are you using (go env)?
go env Output
$ go env
GO111MODULE=""
GOARCH="arm64"
GOBIN=""
GOCACHE="/tmp/go"
GOENV="/home/ubuntu/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="arm64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/usr/local/lib/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/usr/local/lib/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/lib/go-1.20"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/lib/go-1.20/pkg/tool/linux_arm64"
GOVCS=""
GOVERSION="go1.20.3"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="0"
GOMOD="/dev/null"
GOWORK=""
CGO_CFLAGS="-O2 -g"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-O2 -g"
CGO_FFLAGS="-O2 -g"
CGO_LDFLAGS="-O2 -g"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -fno-caret-diagnostics -Qunused-arguments -Wl,--no-gc-sections -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build1714904807=/tmp/go-build -gno-record-gcc-switches"
What did you do?
The net.Resolver accepts an optional Dial function, which is documented as follows:
type Resolver struct {
// Dial optionally specifies an alternate dialer for use by
// Go's built-in DNS resolver to make TCP and UDP connections
// to DNS services. The host in the address parameter will
// always be a literal IP address and not a host name, and the
// port in the address parameter will be a literal port number
// and not a service name.
// If the Conn returned is also a PacketConn, sent and received DNS
// messages must adhere to RFC 1035 section 4.2.1, "UDP usage".
// Otherwise, DNS messages transmitted over Conn must adhere
// to RFC 7766 section 5, "Transport Protocol Selection".
// If nil, the default dialer is used.
Dial func(ctx context.Context, network, address string) (Conn, error)
}
I created a script that logs Dial calls when using the pure Go resolver: https://go.dev/play/p/0O_ARZyK2eG
If I run this script locally, I see something like this:
$ ./resolve
Dial(udp, 127.0.0.53:53)
Dial(udp, 127.0.0.53:53)
{172.217.24.46 }
{2404:6800:4006:804::200e }
However, if I run the script with strace, I see that Go is making additional connections some other way:
$ strace ./resolve 2>&1 | grep '^connect'
connect(7, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("127.0.0.53")}, 16) = 0
connect(3, {sa_family=AF_INET, sin_port=htons(9), sin_addr=inet_addr("172.217.24.46")}, 16) = 0
connect(3, {sa_family=AF_INET6, sin6_port=htons(9), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "2404:6800:4006:804::200e", &sin6_addr), sin6_scope_id=0}, 28) = -1 ENETUNREACH (Network is unreachable)
There is one hardcoded call to net.DialUDP here, which appears to be the source of the additional connections.
What did you expect to see?
I expect to see the Dial function used for all connections made by the pure Go resolver.
What did you see instead?
I see that the Dial function is only used in some cases.
Additional context
CL 500576 fixes the issue by using net.Resolver.Dial in all cases.
For context, this change is important for targets with limited networking capabilities (e.g. GOOS=wasip1). It means that users can provide their own Dial function to make use of the pure Go resolver. At the moment the hardcoded net.DialUDP call makes the pure Go resolver off limits for these targets.
There was some concern in the CL about whether making this change for all targets would break code in the wild. I'm submitting it as a bug report so we can discuss here instead.
cc GOOS=wasip1 maintainers: @achille-roussel @johanbrandhorst @Pryz
cc those that commented on CL 500576: @mateusz834 @ianlancetaylor
If I replace the hardcoded DialUDP call with r.dial("udp") then the provided Dial function is used in all cases.
-c, err = DialUDP("udp", nil, &dst)
+c, err = r.dial(ctx, "udp", dst.IP.String())
This has the additional benefit of threading the lookup context through to the underlying dialer.
If we're concerned about breaking code in the wild, we could instead opt in by target and take this path only for GOOS=wasip1 for now (since it has limited networking capabilities, and DialUDP always fails there).
This approach was suggested by @mateusz834:
if runtime.GOOS == "wasip1" {
c, err = r.dial(ctx, "udp", dst.IP.String())
} else {
c, err = DialUDP("udp", nil, &dst)
}
@ianlancetaylor suggested that we might instead require an additional hook:
type Resolver struct {
Dial func(ctx context.Context, network, address string) (Conn, error)
// Extra hook:
DialUDP func(ctx context.Context, network, address string) (Conn, error)
}
or something like this:
type Resolver struct {
Dial func(ctx context.Context, network, address string) (Conn, error)
// Extra hook:
UDPConnect func(ctx context.Context, addr *UDPAddr) (*UDPAddr, bool)
}
Change https://go.dev/cl/500576 mentions this issue: net: prefer Resolver.Dial over DialUDP on wasip1
The runtime.GOOS == "wasip1" guard was just a simple idea for a fix, but I agree with @ianlancetaylor that per-platform behaviour is not ideal here.
I think this hook should be named something like IsAddrReachable, so that the intention is clear.
It should probably also take a netip.Addr at this point.
type Resolver struct {
// IsAddrReachable is used for address sorting by the go resolver.
// When this field is nil, the default dialer is used: addr is considered reachable
// when the default dialer successfully establishes a UDP connection to addr.
IsAddrReachable func(ctx context.Context, addr netip.Addr) (local netip.Addr, reachable bool)
}
CL 502315 improved the situation for wasip1 by addressing the panic in net.DialUDP. Since it no longer panics, an error from the hardcoded call only affects the sort order.
What would be an option for DNS resolution in Chrome with wasip1?
Hey folks, I had some time and put together a repro that can help with fixing this issue on other OSes such as macOS (Darwin). It uses ktrace, since strace is Linux-only:
//go:build darwin
// +build darwin
package main
import (
"bytes"
"context"
"fmt"
"os"
"os/exec"
"path/filepath"
"regexp"
"strings"
"syscall"
"time"
)
const progGo = `
package main
import (
"context"
"fmt"
"net"
"os/signal"
"syscall"
"time"
)
func main() {
var d net.Dialer
r := net.Resolver{
PreferGo: true,
Dial: func(ctx context.Context, network, address string) (net.Conn, error) {
fmt.Printf("Dial(%s, %s)\n", network, address)
return d.DialContext(ctx, network, address)
},
}
ctx, cancel := signal.NotifyContext(context.Background(), syscall.SIGIO, syscall.SIGTERM)
defer cancel()
for {
select {
case <-ctx.Done():
return
case <-time.After(1 * time.Second):
ips, err := r.LookupIPAddr(ctx, "google.com")
if err != nil {
panic(err)
}
for _, ip := range ips {
fmt.Println(ip)
}
}
}
}
`
func main() {
tmpDir, err := os.MkdirTemp("", "60712")
if err != nil {
panic(err)
}
defer os.RemoveAll(tmpDir)
path := filepath.Join(tmpDir, "outf.go")
if err := os.WriteFile(path, []byte(progGo), 0755); err != nil {
panic(err)
}
binaryPath := filepath.Join(tmpDir, "ourbin")
ctx := context.Background()
if err := exec.CommandContext(ctx, "go", "build", "-o", binaryPath, path).Run(); err != nil {
panic(err)
}
cmd := exec.CommandContext(ctx, binaryPath)
if err := cmd.Start(); err != nil {
panic(err)
}
ktraceCmd := exec.CommandContext(ctx, "sudo", "ktrace", "trace", "-p", fmt.Sprintf("%d", cmd.Process.Pid))
stdout := new(bytes.Buffer)
ktraceCmd.Stdout = stdout
if err := ktraceCmd.Start(); err != nil {
println(stdout.String())
panic(err)
}
<-time.After(5 * time.Second)
if err := cmd.Process.Signal(syscall.SIGTERM); err != nil {
panic(err)
}
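if err := ktraceCmd.Process.Signal(syscall.SIGTERM); err != nil {
panic(err)
}
// Wait for ktrace to exit so its output is fully copied into stdout before it is read.
_ = ktraceCmd.Wait()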
regConnect := regexp.MustCompile(".*connect.*")
if matches := regConnect.FindAllString(stdout.String(), -1); len(matches) != 0 {
println("Found connect like syscall invocations")
// The header is at the first line.
if i := strings.Index(stdout.String(), "\n"); i >= 0 {
println(stdout.String()[:i])
}
for _, match := range matches {
println(match)
}
if err := os.WriteFile("ktrace.log", stdout.Bytes(), 0755); err != nil {
panic(err)
}
panic("found credible match, entire written to: ktrace.log")
}
println("no match")
}
It must be run as superuser, and it prints:
$ sudo go run main.go
Found connect like syscall invocations
walltime delta(us)(duration) debug-id arg1 arg2 arg3 arg4 thread-id cpu process-name(pid)
2025-12-09 23:07:53.705276 EST 1.8 BSC_connect 6 2e7c5f8b612c 10 2e7c5f8c2be8 103530d 4(AP) ourbin(76027)
2025-12-09 23:07:53.705290 EST 1.7 BSC_connect 7 2e7c5f71c04c 10 2e7c5f8bcbe8 103530c 6(AP) ourbin(76027)
2025-12-09 23:07:53.705304 EST 7.5(27.8) BSC_connect 0 0 0 128fb 103530d 4(AP) ourbin(76027)
2025-12-09 23:07:53.705309 EST 0.6(19.5) BSC_connect 0 0 0 128fb 103530c 6(AP) ourbin(76027)
2025-12-09 23:07:54.002628 EST 1.7 BSC_connect 6 2e7c5f8b419c 1c 2e7c5f8c1148 103530c 0(AP) ourbin(76027)
2025-12-09 23:07:54.002641 EST 5.3(12.4) BSC_connect 41 0 0 128fb 103530c 0(AP) ourbin(76027)
2025-12-09 23:07:54.002659 EST 1.2 BSC_connect 6 2e7c5f8b61ac 10 2e7c5f8c1148 103530c 0(AP) ourbin(76027)
2025-12-09 23:07:54.002674 EST 8.3(14.7) BSC_connect 0 0 0 128fb 103530c 0(AP) ourbin(76027)
2025-12-09 23:07:55.004255 EST 1.9 BSC_connect 7 2e7c5f71c0ac 10 2e7c5f8c0be8 1035321 2(AP) ourbin(76027)
2025-12-09 23:07:55.004255 EST 0.3 BSC_connect 6 2e7c5f92204c 10 2e7c5f920be8 103530c 0(AP) ourbin(76027)
2025-12-09 23:07:55.004287 EST 8.1(32.4) BSC_connect 0 0 0 128fb 1035321 2(AP) ourbin(76027)
2025-12-09 23:07:55.004297 EST 0.2(41.3) BSC_connect 0 0 0 128fb 103530c 0(AP) ourbin(76027)
2025-12-09 23:07:55.010853 EST 0.3 BSC_connect 6 2e7c5f80e81c 1c 2e7c5f91f148 103530c 6(AP) ourbin(76027)
2025-12-09 23:07:55.010886 EST 20.3(33.2) BSC_connect 41 0 0 128fb 103530c 6(AP) ourbin(76027)
2025-12-09 23:07:55.010947 EST 3.4 BSC_connect 6 2e7c5f71c12c 10 2e7c5f91f148 103530c 6(AP) ourbin(76027)
2025-12-09 23:07:55.011003 EST 21.4(56.0) BSC_connect 0 0 0 128fb 103530c 6(AP) ourbin(76027)
2025-12-09 23:07:56.011587 EST 0.2 BSC_connect 7 2e7c5f71c18c 10 2e7c5f91ebe8 1035321 2(AP) ourbin(76027)
2025-12-09 23:07:56.011600 EST 0.3 BSC_connect 6 2e7c5f8b622c 10 2e7c5f91abe8 103530c 6(AP) ourbin(76027)
2025-12-09 23:07:56.011614 EST 12.9(27.1) BSC_connect 0 0 0 128fb 1035321 2(AP) ourbin(76027)
2025-12-09 23:07:56.011622 EST 0.3(21.8) BSC_connect 0 0 0 128fb 103530c 6(AP) ourbin(76027)
2025-12-09 23:07:56.016401 EST 2.4 BSC_connect 6 2e7c5f80e85c 1c 2e7c5f8c3148 103530f 6(AP) ourbin(76027)
2025-12-09 23:07:56.016415 EST 0.2(14.2) BSC_connect 41 0 0 128fb 103530f 6(AP) ourbin(76027)
2025-12-09 23:07:56.016440 EST 1.6 BSC_connect 6 2e7c5f71c20c 10 2e7c5f8c3148 103530f 6(AP) ourbin(76027)
2025-12-09 23:07:56.016466 EST 8.1(25.8) BSC_connect 0 0 0 128fb 103530f 6(AP) ourbin(76027)
panic: found credible match, entire written to: ktrace.log
goroutine 1 [running]:
main.main()
/Users/emmanuelodeke/Desktop/openSrc/bugs/golang/60712/main.go:114 +0x79b
exit status 2
In this output, BSC_connect is the BSD connect system call. When fixed I believe the program which can also be tailored for the program