longhorn-manager
longhorn-manager copied to clipboard
fix(setting): check if the settings are synchronized first.
Which issue(s) this PR fixes:
Issue longhorn/longhorn#7797
What this PR does / why we need it:
Check if the settings are synchronized first instead of checking if volumes are attached first to eliminate unnecessary error logs.
Special notes for your reviewer:
Additional documentation or context
@PhanLe1010 @derekbit Please help review this.
Right now, it will still log even if a particular data engine is disabled even though we already cleanup the corresponding IM and there is no changes in the system.
@PhanLe1010,
Could you describe the Longhorn version, the steps to reproduce it, and show the logs you found? Is the log like this one?
E0127 15:17:07.376434 1 setting_controller.go:208] failed to sync setting for longhorn-system/guaranteed-instance-manager-cpu: current state prevents this: failed to apply guaranteed-instance-manager-cpu setting to Longhorn components when there are attached volumes. It will be eventually applied
Because we depend on the function AreAllVolumesDetached(here) to check if all volumes are detached, it should not return an error (and log) if there is no instance manager for a particular data engine.
Right now, it will still log even if a particular data engine is disabled even though we already cleanup the corresponding IM and there is no changes in the system.
@PhanLe1010,
Could you describe the Longhorn version, the steps to reproduce it, and show the logs you found? Is the log like this one?
E0127 15:17:07.376434 1 setting_controller.go:208] failed to sync setting for longhorn-system/guaranteed-instance-manager-cpu: current state prevents this: failed to apply guaranteed-instance-manager-cpu setting to Longhorn components when there are attached volumes. It will be eventually appliedBecause we depend on the function
AreAllVolumesDetached(here) to check if all volumes are detached, it should not return an error (and log) if there is no instance manager for a particular data engine.
Sorry for the late response. So with this PR, we have:
- When we sync the setting
SettingNameV2DataEngine, we enter this line of code https://github.com/mantissahz/longhorn-manager/blob/a9bff9d1a3ddc496808cc13d7989bc6efed839cf/controller/setting_controller.go#L985-L986 - The above line of code check if the
dataEngineEnabledis enable or not. Assume that the setting is never changed and it is always isfalse. In this case, we are still printing out the log https://github.com/mantissahz/longhorn-manager/blob/a9bff9d1a3ddc496808cc13d7989bc6efed839cf/datastore/longhorn.go#L419-L420 if there are attached volumes. - The end result is that we will print out that log every 30s even though there is nothing change in the system/setting
WDYT about this concern?
@PhanLe1010,
The above line of code check if the dataEngineEnabled is enable or not. Assume that the setting is never changed and it is always is false.
- The setting is always false after a fresh install, and it would have any volumes attached. I think it will not print the logs or do you think there will be an error returned from https://github.com/mantissahz/longhorn-manager/blob/a9bff9d1a3ddc496808cc13d7989bc6efed839cf/datastore/longhorn.go#L419-L420?
- The setting is enabled and some volumes are attached and then disable the setting. It should be printing logs.
In this case, we are still printing out the log https://github.com/mantissahz/longhorn-manager/blob/a9bff9d1a3ddc496808cc13d7989bc6efed839cf/datastore/longhorn.go#L419-L420 if there are attached volumes.
Do you mean we are printing out the log https://github.com/mantissahz/longhorn-manager/blob/a9bff9d1a3ddc496808cc13d7989bc6efed839cf/datastore/longhorn.go#L423-L425 when volumes are attached with dataEngineEnabled false?
or Do you mean that when dataEngineEnabled is false and https://github.com/mantissahz/longhorn-manager/blob/a9bff9d1a3ddc496808cc13d7989bc6efed839cf/datastore/longhorn.go#L417 will return an error?
- I think we should have the log when volumes are attached with
dataEngineEnabledfalse, right? - The log level is
Trace(ErrorIsInvalidState) like other danger zone settings.
Do you mean we are printing out the log https://github.com/mantissahz/longhorn-manager/blob/a9bff9d1a3ddc496808cc13d7989bc6efed839cf/datastore/longhorn.go#L423-L425 when volumes are attached with dataEngineEnabled false?
Hi @mantissahz , this is what I meant. We are printing out the logs when there is nothing change in the Longhorn system. Specifically, we are printing out the log when:
- There are attached volumes and they have not changed. They are still attached
- Setting dataEngineEnabled is false and it has not changed. It is still false.
I think the purpose of this PR is that we should not print out the duplicated logs so we should not have the above behavior. WDYT?
After discussing with @mantissahz , we think that it is difficult to achieve the behavior at https://github.com/longhorn/longhorn-manager/pull/2539#issuecomment-2043939546. We can keep the current implementation for now
@Mergify backport v1.6.x
backport v1.6.x
✅ Backports have been created
- #2730 fix(setting): check if the settings are synchronized first. (backport #2539) has been created for branch
v1.6.x