celeborn
[CELEBORN-1627] Introduce `instance` variable for celeborn dashboard to filter metrics
What changes were proposed in this pull request?
- Add an `instance` label in the metrics source, preferring `FQDN:port` over `ip:port`. Previously the label was `ip:port` even with `celeborn.network.bind.preferIpAddress=false`.
- Add a variable `instance` defined as `label_values(metrics_JVMCPUTime_Value, instance)`, the same as in `celeborn-jvm-dashboard.json`.
- Add the filter `instance=~"${instance}"` to every metric expression.
- Add the missing `legendFormat` for the memory file storage metric expressions.
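To illustrate the variable and the filter described above, here is a minimal sketch of how they might look in a Grafana dashboard JSON file. Only the variable name, the `label_values(...)` query, and the `instance=~"${instance}"` filter come from this PR; the surrounding fields (`includeAll`, `multi`, `refresh`, the panel title, and the `{{instance}}` legend) are assumptions chosen for illustration and may differ from the actual dashboard.

```json
{
  "templating": {
    "list": [
      {
        "name": "instance",
        "label": "instance",
        "type": "query",
        "query": "label_values(metrics_JVMCPUTime_Value, instance)",
        "includeAll": true,
        "multi": true,
        "refresh": 1
      }
    ]
  },
  "panels": [
    {
      "title": "JVM CPU time (illustrative panel)",
      "type": "timeseries",
      "targets": [
        {
          "expr": "metrics_JVMCPUTime_Value{instance=~\"${instance}\"}",
          "legendFormat": "{{instance}}"
        }
      ]
    }
  ]
}
```

With the variable left at its "All" option, the regex matcher selects every instance, so the panels should show the same series as before the filter was added.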
Why are the changes needed?
A production deployment may have many Celeborn instances, so it is better to be able to filter metrics by instance.
Does this PR introduce any user-facing change?
Yes. It introduces a new dashboard variable, `instance`.
Its default value is ALL, so the default behavior is the same as before.
How was this patch tested?
Config: `celeborn.network.bind.preferIpAddress=false`
For JVM metrics, the `instance` label was `ip:port` before this change and is now `FQDN:port`.
TODO:
Use `{{instance}}` as the default `baseLegend` and add more labels for metrics such as:
- `metrics_FlushWorkingQueueSize_Value` -> `$baseLegend mountpoint={{mountpoint}}`
- `metrics_DeviceOSFreeBytes_Value` -> `$baseLegend device={{device}}`
- `metrics_DeviceCelebornFreeBytes_Value` -> `$baseLegend device={{device}}`
Thanks. Merged into main(v.0.6.0).