longhorn-manager

Published 20 hours ago •


improve logging and status for volume scheduling

Open philippedev101 opened this issue 5 months ago • 4 comments


Which issue(s) this PR fixes:

https://github.com/longhorn/longhorn/issues/10999

What this PR does / why we need it:

Improve the clarity of log messages and volume status conditions related to replica scheduling.

Volume Controller:

Refined reconcileVolumeCondition to produce more precise Scheduled condition messages, avoiding redundancy when replicas are unscheduled.
Detailed scheduler errors are now used primarily for PV annotations and for determining the Faulted state reliably.
Enhanced logs for scheduling attempts (distinguishing retries from persistent failures) and for dropping volumes from the queue.

Replica Scheduler:

Modified scheduling functions to return util.MultiError, aggregating all reasons for scheduling failures for better upstream reporting.
Log messages now include more specific context (volume, replica, node, disk) and detailed reasons for placement decisions.
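A minimal sketch of the aggregation idea, for reviewers unfamiliar with the pattern. The `schedulingErrors` type below is a hypothetical stand-in written for this example and is not the actual `util.MultiError` implementation; `candidate`, `pickDisk`, and the reason strings are likewise assumptions.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// schedulingErrors is a hypothetical stand-in for an aggregated error:
// a set of distinct failure reasons. It is not the real longhorn type.
type schedulingErrors map[string]struct{}

// add records a reason; identical reasons collapse into one entry.
func (e schedulingErrors) add(reason string) { e[reason] = struct{}{} }

// join returns all reasons in sorted order so the output is stable.
func (e schedulingErrors) join() string {
	reasons := make([]string, 0, len(e))
	for r := range e {
		reasons = append(reasons, r)
	}
	sort.Strings(reasons)
	return strings.Join(reasons, "; ")
}

// candidate is a hypothetical, simplified view of a disk on a node.
type candidate struct {
	node, disk  string
	schedulable bool
	freeBytes   int64
}

// pickDisk returns the first disk that fits, or every reason why no
// candidate was usable, instead of only the last failure seen.
func pickDisk(cands []candidate, required int64) (string, schedulingErrors) {
	errs := schedulingErrors{}
	for _, c := range cands {
		switch {
		case !c.schedulable:
			errs.add(fmt.Sprintf("disk %s on node %s is unschedulable", c.disk, c.node))
		case c.freeBytes < required:
			errs.add(fmt.Sprintf("disk %s on node %s has insufficient storage", c.disk, c.node))
		default:
			return c.disk, nil
		}
	}
	return "", errs
}

func main() {
	_, errs := pickDisk([]candidate{
		{node: "node-1", disk: "disk-a", schedulable: false, freeBytes: 100},
		{node: "node-2", disk: "disk-b", schedulable: true, freeBytes: 5},
	}, 10)
	fmt.Println(errs.join())
}
```

Because every rejected candidate contributes its own reason, the caller can surface the full picture in a condition message or annotation rather than a single generic failure.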

Testing:

Updated test expectations for VolumeStatus to match new controller behavior.
Added new scheduler tests to verify detailed error reporting.

Special notes for your reviewer:

Have a nice day!

Additional documentation or context

None at the moment.

May 30 '25 18:05 philippedev101