Dragonfly2

dfdaemon is connecting to an incorrect scheduler domain name and IP address.

Open zsksy123 opened this issue 8 months ago • 1 comment

Version Information

helm list
NAME     	NAMESPACE	REVISION	UPDATED                             	STATUS  	CHART           	APP VERSION
dragonfly	dragonfly	29      	2025-05-06 17:43:30.005192 +0800 CST	deployed	dragonfly-1.1.32	2.1.31

Bug: the dfdaemon component tries to connect to an incorrect FQDN for the scheduler.

kubectl  logs -f dragonfly-dfdaemon-zwgnd|grep -i error

2025-05-07T05:33:21.097Z	WARN	config/dynconfig_manager.go:132	scheduler host address dragonfly-scheduler-2.scheduler.dragonfly.svc.wlcb.in.openbayes.com:8002 is unreachable: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp: lookup dragonfly-scheduler-2.scheduler.dragonfly.svc.wlcb.in.openbayes.com on 10.97.0.10:53: no such host"
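
To tell a DNS problem apart from a scheduler outage, the hostname in the warning can be extracted and resolved by hand from inside a pod. A minimal Python sketch, assuming the log wording shown above; `unreachable_target` and `resolves` are hypothetical helper names and are not part of Dragonfly:

```python
import re
import socket

# Matches the dfdaemon warning shown above. The wording is assumed from this
# single log line and may differ between Dragonfly versions.
UNREACHABLE = re.compile(r"scheduler host address (\S+):(\d+) is unreachable")


def unreachable_target(log_line):
    """Return (host, port) from an 'is unreachable' warning, or None."""
    match = UNREACHABLE.search(log_line)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def resolves(host, port):
    """Return True if the local resolver can resolve host, False otherwise."""
    try:
        socket.getaddrinfo(host, port)
    except socket.gaierror:
        # Name resolution failed, the Python counterpart of "no such host".
        return False
    return True
```

Running `resolves(*unreachable_target(line))` from a pod in the same cluster would show whether the cluster DNS knows the name at all.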

As far as I know, only headless services provide per-pod FQDNs in the following format: <pod-name>.<service-name>.<namespace>.svc.cluster.local. However, the scheduler svc is not headless by default.

kubectl get svc|grep dragonfly-scheduler
dragonfly-scheduler           ClusterIP   10.97.104.238   <none>        8002/TCP             350d
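
The reasoning above can be sketched in Python: build the per-pod name that a headless service would publish, and check from a `kubectl get svc` row whether the service is headless at all. This is only an illustration. `pod_fqdn` and `is_headless` are hypothetical helpers, and the column layout (NAME, TYPE, CLUSTER-IP, ...) is assumed to be the default `kubectl` output; a headless service shows `None` in the CLUSTER-IP column.

```python
def pod_fqdn(pod, service, namespace, cluster_domain="cluster.local"):
    """Build <pod-name>.<service-name>.<namespace>.svc.<cluster-domain>."""
    return f"{pod}.{service}.{namespace}.svc.{cluster_domain}"


def is_headless(svc_row):
    """Return True if a default `kubectl get svc` row describes a headless service.

    Assumed column order: NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE.
    """
    columns = svc_row.split()
    return len(columns) >= 3 and columns[2] == "None"
```

For the row shown above, `is_headless` returns `False` because the CLUSTER-IP column holds 10.97.104.238, so no per-pod DNS records would be expected for this service.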

The IP address is also incorrect: the dragonfly-scheduler-0 pod runs at 10.96.6.213, but dfdaemon reports 10.96.6.243.

kubectl  get pod -o wide|grep dragonfly-scheduler-0
dragonfly-scheduler-0                1/1     Running   0             3m16s   10.96.6.213     titan-v1   <none>           <none>
2025-05-07T06:49:08.280Z	WARN	config/dynconfig_manager.go:142	scheduler 10.96.6.243 dragonfly-scheduler-0.scheduler.dragonfly.svc.bj.in.openbayes.com 8002 has not reachable addresses
2025-05-07T06:49:38.279Z	WARN	config/dynconfig_manager.go:142	scheduler 10.96.6.243 dragonfly-scheduler-0.scheduler.dragonfly.svc.bj.in.openbayes.com 8002 has not reachable addresses
d7y.io/dragonfly/v2/client/config.(*dynconfigManager).GetResolveSchedulerAddrs
	/go/src/d7y.io/dragonfly/v2/client/config/dynconfig_manager.go:142
d7y.io/dragonfly/v2/pkg/resolver.(*SchedulerResolver).ResolveNow
	/go/src/d7y.io/dragonfly/v2/pkg/resolver/scheduler_resolver.go:82
d7y.io/dragonfly/v2/pkg/resolver.(*SchedulerResolver).OnNotify
	/go/src/d7y.io/dragonfly/v2/pkg/resolver/scheduler_resolver.go:109
d7y.io/dragonfly/v2/client/config.(*dynconfigManager).Notify
	/go/src/d7y.io/dragonfly/v2/client/config/dynconfig_manager.go:242
d7y.io/dragonfly/v2/client/config.(*dynconfigManager).Serve
	/go/src/d7y.io/dragonfly/v2/client/config/dynconfig_manager.go:268
d7y.io/dragonfly/v2/client/daemon.(*clientDaemon).Serve.func10
	/go/src/d7y.io/dragonfly/v2/client/daemon/daemon.go:744
golang.org/x/sync/errgroup.(*Group).Go.func1
	/go/pkg/mod/golang.org/x/[email protected]/errgroup/errgroup.go:78
2025-05-07T06:49:38.279Z	ERROR	grpclog/grpclog.go:55	[scheduler_resolver]resolve addresses error can not found available scheduler addresses
google.golang.org/grpc/internal/grpclog.ErrorDepth
	/go/pkg/mod/google.golang.org/[email protected]/internal/grpclog/grpclog.go:55
google.golang.org/grpc/grpclog.(*componentData).ErrorDepth
	/go/pkg/mod/google.golang.org/[email protected]/grpclog/component.go:46
google.golang.org/grpc/grpclog.(*componentData).Errorf
	/go/pkg/mod/google.golang.org/[email protected]/grpclog/component.go:79
d7y.io/dragonfly/v2/pkg/resolver.(*SchedulerResolver).ResolveNow
	/go/src/d7y.io/dragonfly/v2/pkg/resolver/scheduler_resolver.go:84
d7y.io/dragonfly/v2/pkg/resolver.(*SchedulerResolver).OnNotify
	/go/src/d7y.io/dragonfly/v2/pkg/resolver/scheduler_resolver.go:109
d7y.io/dragonfly/v2/client/config.(*dynconfigManager).Notify
	/go/src/d7y.io/dragonfly/v2/client/config/dynconfig_manager.go:242
d7y.io/dragonfly/v2/client/config.(*dynconfigManager).Serve
	/go/src/d7y.io/dragonfly/v2/client/config/dynconfig_manager.go:268
d7y.io/dragonfly/v2/client/daemon.(*clientDaemon).Serve.func10
	/go/src/d7y.io/dragonfly/v2/client/daemon/daemon.go:744
golang.org/x/sync/errgroup.(*Group).Go.func1
	/go/pkg/mod/golang.org/x/[email protected]/errgroup/errgroup.go:78
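
The IP mismatch above can also be checked mechanically by comparing the address in the dfdaemon warning with the one in the `kubectl get pod -o wide` row. A rough Python sketch, assuming the log wording and default column layout shown in this issue; all function names are hypothetical:

```python
import re

# Wording assumed from the warning lines above; it may vary by version.
NOT_REACHABLE = re.compile(
    r"scheduler (\S+) (\S+) (\d+) has not reachable addresses")
# First dotted-quad in a line; good enough for a `kubectl get pod -o wide` row.
IPV4 = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")


def recorded_scheduler(log_line):
    """Return the ip, hostname and port that dfdaemon logged, or None."""
    match = NOT_REACHABLE.search(log_line)
    if match is None:
        return None
    return {"ip": match.group(1),
            "hostname": match.group(2),
            "port": int(match.group(3))}


def pod_ip(pod_row):
    """Return the first IPv4 address in a pod listing row, or None."""
    match = IPV4.search(pod_row)
    return match.group(0) if match else None


def ip_differs(log_line, pod_row):
    """Return True if the logged scheduler IP differs from the pod's IP."""
    recorded = recorded_scheduler(log_line)
    actual = pod_ip(pod_row)
    if recorded is None or actual is None:
        return None  # not enough information to compare
    return recorded["ip"] != actual
```

With the two lines quoted above, `ip_differs` returns `True` (10.96.6.243 versus 10.96.6.213).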

zsksy123, May 07 '25 05:05

@zsksy123 The manager needs to clean up the dirty data. I think this is a feature. A PR is welcome!

gaius-qi, May 13 '25 02:05