Dragonfly2
dfdaemon is connecting to an incorrect scheduler domain name and IP address.
Version Information
helm list
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
dragonfly dragonfly 29 2025-05-06 17:43:30.005192 +0800 CST deployed dragonfly-1.1.32 2.1.31
Bug: the dfdaemon component tries to connect to an incorrect FQDN for the scheduler.
kubectl logs -f dragonfly-dfdaemon-zwgnd|grep -i error
2025-05-07T05:33:21.097Z WARN config/dynconfig_manager.go:132 scheduler host address dragonfly-scheduler-2.scheduler.dragonfly.svc.wlcb.in.openbayes.com:8002 is unreachable: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp: lookup dragonfly-scheduler-2.scheduler.dragonfly.svc.wlcb.in.openbayes.com on 10.97.0.10:53: no such host"
As far as I know, only pods behind a headless Service get a per-pod FQDN in the following format.
<pod-name>.<service-name>.<namespace>.svc.cluster.local
However, the scheduler Service created by default is a regular ClusterIP Service, not a headless one.
kubectl get svc|grep dragonfly-scheduler
dragonfly-scheduler ClusterIP 10.97.104.238 <none> 8002/TCP 350d
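To make the FQDN point concrete, here is a minimal Go sketch that builds a per-pod name in the format above and checks whether it resolves. The helper names `podFQDN` and `resolves` are hypothetical and not part of Dragonfly. The `cluster.local` domain is only the common default; the logs above show a custom cluster domain, so substitute your own.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"time"
)

// podFQDN builds a per-pod DNS name of the form
// <pod-name>.<service-name>.<namespace>.svc.<cluster-domain>.
// Hypothetical helper for illustration only.
func podFQDN(pod, service, namespace, clusterDomain string) string {
	return fmt.Sprintf("%s.%s.%s.svc.%s", pod, service, namespace, clusterDomain)
}

// resolves reports whether the host has at least one address record
// according to the resolver configured for this process.
func resolves(ctx context.Context, host string) bool {
	addrs, err := net.DefaultResolver.LookupHost(ctx, host)
	return err == nil && len(addrs) > 0
}

func main() {
	// Assumed values: adjust the service name and cluster domain to your cluster.
	host := podFQDN("dragonfly-scheduler-0", "scheduler", "dragonfly", "cluster.local")

	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()

	fmt.Printf("%s resolvable: %v\n", host, resolves(ctx, host))
}
```

Run from a pod inside the cluster, this should report `false` for a name like the one in the warning above whenever no matching headless Service exists.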
The IP address is also incorrect: the scheduler pod's IP is 10.96.6.213, but dfdaemon reports 10.96.6.243.
kubectl get pod -o wide|grep dragonfly-scheduler-0
dragonfly-scheduler-0 1/1 Running 0 3m16s 10.96.6.213 titan-v1 <none> <none>
2025-05-07T06:49:08.280Z WARN config/dynconfig_manager.go:142 scheduler 10.96.6.243 dragonfly-scheduler-0.scheduler.dragonfly.svc.bj.in.openbayes.com 8002 has not reachable addresses
2025-05-07T06:49:38.279Z WARN config/dynconfig_manager.go:142 scheduler 10.96.6.243 dragonfly-scheduler-0.scheduler.dragonfly.svc.bj.in.openbayes.com 8002 has not reachable addresses
d7y.io/dragonfly/v2/client/config.(*dynconfigManager).GetResolveSchedulerAddrs
/go/src/d7y.io/dragonfly/v2/client/config/dynconfig_manager.go:142
d7y.io/dragonfly/v2/pkg/resolver.(*SchedulerResolver).ResolveNow
/go/src/d7y.io/dragonfly/v2/pkg/resolver/scheduler_resolver.go:82
d7y.io/dragonfly/v2/pkg/resolver.(*SchedulerResolver).OnNotify
/go/src/d7y.io/dragonfly/v2/pkg/resolver/scheduler_resolver.go:109
d7y.io/dragonfly/v2/client/config.(*dynconfigManager).Notify
/go/src/d7y.io/dragonfly/v2/client/config/dynconfig_manager.go:242
d7y.io/dragonfly/v2/client/config.(*dynconfigManager).Serve
/go/src/d7y.io/dragonfly/v2/client/config/dynconfig_manager.go:268
d7y.io/dragonfly/v2/client/daemon.(*clientDaemon).Serve.func10
/go/src/d7y.io/dragonfly/v2/client/daemon/daemon.go:744
golang.org/x/sync/errgroup.(*Group).Go.func1
/go/pkg/mod/golang.org/x/sync@<version>/errgroup/errgroup.go:78
2025-05-07T06:49:38.279Z ERROR grpclog/grpclog.go:55 [scheduler_resolver]resolve addresses error can not found available scheduler addresses
google.golang.org/grpc/internal/grpclog.ErrorDepth
/go/pkg/mod/google.golang.org/grpc@<version>/internal/grpclog/grpclog.go:55
google.golang.org/grpc/grpclog.(*componentData).ErrorDepth
/go/pkg/mod/google.golang.org/grpc@<version>/grpclog/component.go:46
google.golang.org/grpc/grpclog.(*componentData).Errorf
/go/pkg/mod/google.golang.org/grpc@<version>/grpclog/component.go:79
d7y.io/dragonfly/v2/pkg/resolver.(*SchedulerResolver).ResolveNow
/go/src/d7y.io/dragonfly/v2/pkg/resolver/scheduler_resolver.go:84
d7y.io/dragonfly/v2/pkg/resolver.(*SchedulerResolver).OnNotify
/go/src/d7y.io/dragonfly/v2/pkg/resolver/scheduler_resolver.go:109
d7y.io/dragonfly/v2/client/config.(*dynconfigManager).Notify
/go/src/d7y.io/dragonfly/v2/client/config/dynconfig_manager.go:242
d7y.io/dragonfly/v2/client/config.(*dynconfigManager).Serve
/go/src/d7y.io/dragonfly/v2/client/config/dynconfig_manager.go:268
d7y.io/dragonfly/v2/client/daemon.(*clientDaemon).Serve.func10
/go/src/d7y.io/dragonfly/v2/client/daemon/daemon.go:744
golang.org/x/sync/errgroup.(*Group).Go.func1
/go/pkg/mod/golang.org/x/sync@<version>/errgroup/errgroup.go:78
@zsksy123 The manager needs to clean up the dirty (stale) scheduler data. I think this would be a new feature. PRs are welcome!