EdgeMesh: support retries when calling microservices

Open ZBoIsHere opened this issue 3 years ago • 2 comments

What would you like to be added/modified: EdgeMesh should support configuring the maximum number of retries when calling a microservice.

Why is this needed: Requests to microservices often fail for various reasons and need to be retried. In microservice governance, retry logic belongs in the governance framework, which here is EdgeMesh, so EdgeMesh should allow the maximum number of retry attempts to be configured.

ZBoIsHere · Dec 08 '21 07:12

@Poorunga Are there any plans to support retrying failed microservice calls?

ZBoIsHere · Feb 18 '22 07:02

@ZBoIsHere Retry mechanism for failed connection attempts on the source side: https://github.com/kubeedge/edgemesh/blob/main/third_party/forked/kubernetes/pkg/proxy/userspace/proxysocket.go#L91-L118

func TryConnectEndpoints(service proxy.ServicePortName, srcAddr net.Addr, tcpConn *net.TCPConn, protocol string, loadBalancer LoadBalancer) (out io.ReadWriteCloser, err error) {
	sessionAffinityReset := false
	for _, dialTimeout := range EndpointDialTimeouts {
		endpoint, req, err := loadBalancer.NextEndpoint(service, srcAddr, tcpConn, sessionAffinityReset)
		if err != nil {
			klog.ErrorS(err, "Couldn't find an endpoint for service", "service", service)
			return nil, err
		}
		klog.V(3).InfoS("Mapped service to endpoint", "service", service, "endpoint", endpoint)
		outConn, err := TryDialStream(protocol, endpoint, dialTimeout)
		if err != nil {
			if util.IsTooManyFDsError(err) {
				panic("Dial failed: " + err.Error())
			}
			klog.ErrorS(err, "Dial failed")
			sessionAffinityReset = true
			continue
		}
		if req != nil {
			reqBytes, err := util.HttpRequestToBytes(req)
			if err == nil {
				outConn.Write(reqBytes)
			}
		}
		return outConn, nil
	}
	return nil, fmt.Errorf("failed to connect to an endpoint")
}
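
The loop above makes one attempt per entry in `EndpointDialTimeouts`, so the length of that slice is effectively the retry count on the source side. A minimal Go sketch of that pattern follows. It assumes nothing about EdgeMesh internals: `connectWithTimeouts`, `dialFunc`, `failFirst`, and the values in `defaultTimeouts` are hypothetical names and numbers chosen for illustration, and the round-robin endpoint choice is only a stand-in for `loadBalancer.NextEndpoint`.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// dialFunc is a hypothetical stand-in for a real dialer such as TryDialStream.
type dialFunc func(endpoint string, timeout time.Duration) error

// defaultTimeouts holds illustrative values only; one attempt is made per entry.
var defaultTimeouts = []time.Duration{
	250 * time.Millisecond,
	500 * time.Millisecond,
	1 * time.Second,
	2 * time.Second,
}

// connectWithTimeouts makes one dial attempt per timeout, moving to the next
// endpoint (round-robin) after each failure. It returns the endpoint that
// answered and the number of attempts used.
func connectWithTimeouts(endpoints []string, timeouts []time.Duration, dial dialFunc) (string, int, error) {
	if len(endpoints) == 0 {
		return "", 0, errors.New("no endpoints available")
	}
	if len(timeouts) == 0 {
		return "", 0, errors.New("no dial timeouts configured")
	}
	var lastErr error
	for attempt, timeout := range timeouts {
		endpoint := endpoints[attempt%len(endpoints)]
		if err := dial(endpoint, timeout); err != nil {
			lastErr = err // remember the failure and try the next endpoint
			continue
		}
		return endpoint, attempt + 1, nil
	}
	return "", len(timeouts), fmt.Errorf("failed to connect to an endpoint: %w", lastErr)
}

// failFirst returns a fake dialer whose first n calls fail; it exists only
// so the sketch can run without a network.
func failFirst(n int) dialFunc {
	calls := 0
	return func(endpoint string, timeout time.Duration) error {
		calls++
		if calls <= n {
			return fmt.Errorf("dial %s (timeout %v): connection refused", endpoint, timeout)
		}
		return nil
	}
}

func main() {
	endpoint, attempts, err := connectWithTimeouts(
		[]string{"10.0.0.1:8080", "10.0.0.2:8080"}, defaultTimeouts, failFirst(2))
	fmt.Println("endpoint:", endpoint, "attempts:", attempts, "err:", err)
}
```

Under this design, changing the retry count means changing the timeout slice, which also lets each later attempt wait longer than the one before it.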

Retry mechanism for failed proxy connection attempts on the destination side: https://github.com/kubeedge/edgemesh/blob/main/agent/pkg/tunnel/proxy/proxy.go#L94-L125

func (ps *ProxyService) TryConnectEndpoint(msg *pb.Proxy) (net.Conn, error) {
	var err error
	switch msg.GetProtocol() {
	case "tcp":
		for i := 0; i < MaxRetryTime; i++ {
			tcpConn, err := net.DialTCP("tcp", nil, &net.TCPAddr{
				IP:   net.ParseIP(msg.GetIp()),
				Port: int(msg.GetPort()),
			})
			if err == nil {
				return tcpConn, nil
			}
			time.Sleep(time.Second)
		}
		klog.Errorf("max retries for dial")
		return nil, err
	case "udp":
		for i := 0; i < MaxRetryTime; i++ {
			udpConn, err := net.DialUDP("udp", nil, &net.UDPAddr{
				IP:   net.ParseIP(msg.GetIp()),
				Port: int(msg.GetPort()),
			})
			if err == nil {
				return udpConn, nil
			}
		}
		klog.Errorf("max retries for dial")
		return nil, err
	default:
		return nil, fmt.Errorf("unsupported protocol: %s", msg.GetProtocol())
	}
}

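After each failed dial, it sleeps for a few seconds and then dials again; once the maximum number of retries is exceeded, the call fails. Is this the retry-on-failure capability you were looking for?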

Poorunga · Feb 19 '22 08:02
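
For readers following the thread, the request (a configurable maximum number of attempts) can be sketched as below. This is a sketch under stated assumptions, not EdgeMesh's implementation: `RetryConfig`, `dialWithRetry`, and `failingDial` are hypothetical names. Two details of the destination-side snippet quoted above are worth noting when reading it: `err` is redeclared with `:=` inside each loop, so the outer `err` returned after the loop is still nil, and the UDP branch does not sleep between attempts. The sketch keeps the last error in an outer variable and applies the same delay to every retry.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"time"
)

// RetryConfig is a hypothetical configuration type, not an EdgeMesh API.
type RetryConfig struct {
	MaxRetries int           // total number of dial attempts
	Delay      time.Duration // pause between failed attempts
}

// dialWithRetry calls dial up to cfg.MaxRetries times and returns the first
// successful connection, or an error wrapping the last dial failure.
func dialWithRetry(cfg RetryConfig, dial func() (net.Conn, error)) (net.Conn, error) {
	if cfg.MaxRetries <= 0 {
		return nil, errors.New("MaxRetries must be positive")
	}
	var lastErr error
	for attempt := 1; attempt <= cfg.MaxRetries; attempt++ {
		conn, err := dial()
		if err == nil {
			return conn, nil
		}
		lastErr = err // stored outside the loop so it survives to the return
		if attempt < cfg.MaxRetries {
			time.Sleep(cfg.Delay) // no sleep after the final attempt
		}
	}
	return nil, fmt.Errorf("dial failed after %d attempts: %w", cfg.MaxRetries, lastErr)
}

// failingDial returns a dialer that always fails and counts its calls; it
// exists only so the sketch can run without a network.
func failingDial(calls *int) func() (net.Conn, error) {
	return func() (net.Conn, error) {
		*calls++
		return nil, errors.New("connection refused")
	}
}

func main() {
	cfg := RetryConfig{MaxRetries: 3, Delay: 10 * time.Millisecond}
	attempts := 0
	_, err := dialWithRetry(cfg, failingDial(&attempts))
	fmt.Println("attempts:", attempts, "error:", err)
}
```

In real use, the `dial` argument could be a closure around `net.DialTimeout("tcp", addr, timeout)`, which lets the same loop serve both TCP and UDP.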