spring-cloud-netflix icon indicating copy to clipboard operation
spring-cloud-netflix copied to clipboard

EurekaClientConfigurationRefresher & DiscoveryClient#refreshInstanceInfo concurrent execution

Open yuhuangbin opened this issue 3 years ago • 5 comments

Describe the bug As we know, when RefreshScopeRefreshedEvent is published, service instances perform deregister and register actions. refer: EurekaDiscoveryClientConfiguration deregister and register will call ApplicationInfoManager#setInstanceStatus to modify InstanceStatus.

protected static class EurekaClientConfigurationRefresher
			implements ApplicationListener<RefreshScopeRefreshedEvent> {

                 // ...

		public void onApplicationEvent(RefreshScopeRefreshedEvent event) {
			if (eurekaClient != null) {
				eurekaClient.getApplications();
			}
			if (autoRegistration != null) {
                                // deregister instance
				this.autoRegistration.stop();
				// register instance
				this.autoRegistration.start();
			}
		}
	}

But, the DiscoveryClient will also be modified InstanceStatus by schedule task.

void refreshInstanceInfo() {
            applicationInfoManager.refreshDataCenterInfoIfRequired();
            applicationInfoManager.refreshLeaseInfoIfRequired();

            InstanceStatus status;
            try {
                // get instance status
                status = getHealthCheckHandler().getStatus(instanceInfo.getStatus());
            } catch (Exception e) {
                logger.warn("Exception from healthcheckHandler.getStatus, setting status to DOWN", e);
                status = InstanceStatus.DOWN;
            }

            if (null != status) {
                // modify instance status
                applicationInfoManager.setInstanceStatus(status);
            }
       
    }

So, EurekaClientConfigurationRefresher & DiscoveryClient#refreshInstanceInfo concurrent execution causes concurrency issues

Step1 deregister instance then modify InstanceStatus to DOWN by EurekaClientConfigurationRefresher

// deregister instance
this.autoRegistration.stop();

Step2 get InstanceStatus from instanceInfo is DOWN by DiscoveryClient#refreshInstanceInfo()

// get instance status
status = getHealthCheckHandler().getStatus(instanceInfo.getStatus());

Step3 deregister instance then modify InstanceStatus to UP by EurekaClientConfigurationRefresher

// register instance
this.autoRegistration.start();

Step4 modify InstanceStatus to DOWN by DiscoveryClient#refreshInstanceInfo()

// modify instance status
applicationInfoManager.setInstanceStatus(status);

If the execution order is as above, the service InstanceStatus changes to DOWN.

yuhuangbin avatar May 17 '22 05:05 yuhuangbin

It is recommended that the operation of querying the shared variable first and then updating it be locked here, and this problem has not been reproduced after the lock is currently added. @OlgaMaciaszek Can you give a little advice?

    void refreshInstanceInfo() {
        
        applicationInfoManager.refreshDataCenterInfoIfRequired();
        applicationInfoManager.refreshLeaseInfoIfRequired();
        // need lock here
        synchronized (applicationInfoManager) {
            InstanceStatus status;
            try {
                status = getHealthCheckHandler().getStatus(instanceInfo.getStatus());
            } catch (Exception e) {
                logger.warn("Exception from healthcheckHandler.getStatus, setting status to DOWN", e);
                status = InstanceStatus.DOWN;
            }

            if (null != status) {
                applicationInfoManager.setInstanceStatus(status);
            }
        }
    }

yuhuangbin avatar May 19 '22 01:05 yuhuangbin

Hello, @yuhuangbin, thanks for reporting the issue. Will take a look next week.

OlgaMaciaszek avatar May 19 '22 08:05 OlgaMaciaszek

Hello, @OlgaMaciaszek , Is there a conclusion to this issue?

yuhuangbin avatar May 27 '22 10:05 yuhuangbin

Looks like a bug.

OlgaMaciaszek avatar Sep 01 '22 11:09 OlgaMaciaszek