volcano metrics: use milliseconds instead of microseconds as the time unit for scheduling latency

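With microseconds as the unit, the histogram buckets top out at 2560 µs (2.56 ms), so the metric cannot accurately reflect the actual scheduling latency, as the sample output below shows: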

volcano_action_scheduling_latency_microseconds_bucket{action="allocate",le="5"} 0
volcano_action_scheduling_latency_microseconds_bucket{action="allocate",le="10"} 0
volcano_action_scheduling_latency_microseconds_bucket{action="allocate",le="20"} 0
volcano_action_scheduling_latency_microseconds_bucket{action="allocate",le="40"} 0
volcano_action_scheduling_latency_microseconds_bucket{action="allocate",le="80"} 0
volcano_action_scheduling_latency_microseconds_bucket{action="allocate",le="160"} 0
volcano_action_scheduling_latency_microseconds_bucket{action="allocate",le="320"} 1
volcano_action_scheduling_latency_microseconds_bucket{action="allocate",le="640"} 1
volcano_action_scheduling_latency_microseconds_bucket{action="allocate",le="1280"} 68550
volcano_action_scheduling_latency_microseconds_bucket{action="allocate",le="2560"} 198253
volcano_action_scheduling_latency_microseconds_bucket{action="allocate",le="+Inf"} 243524
volcano_action_scheduling_latency_microseconds_sum{action="allocate"} 5.505664439029926e+08
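
To put numbers on the sample above, here is a small sketch that takes the cumulative counts from the dump and works out how many observations exceed the largest finite bucket, along with the mean latency derived from `_sum`. The variable names are illustrative only; the values are copied from the output above.

```python
# Cumulative counts copied from the metric dump above (le is in microseconds).
count_le_2560 = 198253                   # bucket le="2560", the largest finite boundary
count_total = 243524                     # bucket le="+Inf", i.e. every observation
latency_sum_us = 5.505664439029926e+08   # _sum, in microseconds

# Observations that exceeded 2560 us and are therefore only visible in +Inf.
overflow = count_total - count_le_2560
overflow_fraction = overflow / count_total

# Mean latency over all observations, in microseconds.
mean_us = latency_sum_us / count_total

print(f"observations above 2.56 ms: {overflow} ({overflow_fraction:.1%})")
print(f"mean latency: {mean_us:.0f} us")
```

In this sample, 45271 observations sit above the top finite boundary, so the histogram carries no information about how large those latencies were.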

Signed-off-by: Liang Zheng [email protected]

Jun 27 '24 09:06 microyahoo

Welcome @microyahoo!

It looks like this is your first PR to volcano-sh/volcano.

Thank you, and welcome to Volcano. :smiley:

Jun 27 '24 09:06 volcano-sh-bot

Shall we mark it deprecated first and then remove it after several releases?

Jun 27 '24 13:06 lowang-bh

> Shall we mark it deprecated first and then remove it after several releases?

hi @lowang-bh, thanks for your quick response. There is also another workaround that doesn't introduce a breaking change; we just need to modify the exponential bucket count to cover a wider range.
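
For illustration, a minimal sketch of that workaround. The helper below is hypothetical and only mirrors the `(start, factor, count)` semantics of the Prometheus Go client's `ExponentialBuckets`; the values start=5, factor=2, count=10 are inferred from the `le` labels in the PR description, not read from the Volcano source, and the wider count of 20 is an arbitrary example.

```python
def exponential_buckets(start, factor, count):
    """Return `count` upper bounds: start, start*factor, start*factor**2, ..."""
    if count < 1 or start <= 0 or factor <= 1:
        raise ValueError("need count >= 1, start > 0, factor > 1")
    return [start * factor**i for i in range(count)]


# Inferred current layout: 5, 10, 20, ..., 2560 (microseconds).
current = exponential_buckets(5, 2, 10)

# Hypothetical wider layout: same start and factor, more buckets.
wider = exponential_buckets(5, 2, 20)

print("current top boundary (us):", current[-1])
print("wider top boundary (us):", wider[-1])
```

Keeping the metric name and unit while raising the count avoids a breaking change, at the cost of more time series per label set.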

Jun 27 '24 14:06 microyahoo

@Monokaix please take a look.

Jun 27 '24 14:06 lowang-bh

What's the problem with using microseconds?

Jul 10 '24 02:07 Monokaix

> What's the problem with using microseconds?

hi @Vacant2333, with microseconds the largest finite bucket boundary is 2560 µs (2.56 ms). Every value greater than 2.56 ms falls only into the +Inf bucket, so the histogram does not reflect the actual scheduling latency.
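
One practical consequence, sketched under assumptions: a quantile estimated from these buckets cannot exceed the top finite boundary. The function below is a simplified stand-in for the kind of estimation Prometheus's `histogram_quantile` performs (linear interpolation within a bucket, and, as I understand its documented behaviour, returning the upper bound of the second-highest bucket when the quantile lands in the highest one). It is not the Prometheus implementation, and the names are hypothetical. The bucket counts are copied from the PR description.

```python
import math

# (upper bound in microseconds, cumulative count) from the PR description.
buckets = [
    (5, 0), (10, 0), (20, 0), (40, 0), (80, 0), (160, 0),
    (320, 1), (640, 1), (1280, 68550), (2560, 198253),
    (math.inf, 243524),
]


def estimate_quantile(q, buckets):
    """Estimate the q-quantile from cumulative buckets (simplified sketch)."""
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    rank = q * buckets[-1][1]
    prev_bound, prev_count = 0.0, 0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # The rank lies above every finite boundary, so there is
                # nothing to interpolate against: report the top finite bound.
                return buckets[-2][0]
            # Linear interpolation inside the bucket that contains the rank.
            span = count - prev_count
            return prev_bound + (bound - prev_bound) * (rank - prev_count) / span
        prev_bound, prev_count = bound, count
    return buckets[-2][0]


p50 = estimate_quantile(0.50, buckets)
p90 = estimate_quantile(0.90, buckets)
p99 = estimate_quantile(0.99, buckets)
print(f"p50={p50:.0f}us p90={p90}us p99={p99}us")
```

With this data the p90 and p99 estimates both come out as the 2560 µs boundary, regardless of how slow the slowest scheduling actions actually were.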

Jul 11 '24 15:07 microyahoo

>> What's the problem with using microseconds?

> hi @Vacant2333, with microseconds the largest finite bucket boundary is 2560 µs (2.56 ms). Every value greater than 2.56 ms falls only into the +Inf bucket, so the histogram does not reflect the actual scheduling latency.

I want to know how you @mentioned me; I have no connection to this PR.

Jul 11 '24 15:07 Vacant2333

>>> What's the problem with using microseconds?

>> hi @Vacant2333, with microseconds the largest finite bucket boundary is 2560 µs (2.56 ms). Every value greater than 2.56 ms falls only into the +Inf bucket, so the histogram does not reflect the actual scheduling latency.

> I want to know how you @mentioned me; I have no connection to this PR.

I think he meant me :)

Jul 12 '24 01:07 Monokaix

/lgtm

Jul 12 '24 01:07 Monokaix

>>> What's the problem with using microseconds?

>> hi @Vacant2333, with microseconds the largest finite bucket boundary is 2560 µs (2.56 ms). Every value greater than 2.56 ms falls only into the +Inf bucket, so the histogram does not reflect the actual scheduling latency.

> I want to know how you @mentioned me; I have no connection to this PR.

sorry, my mistake @Vacant2333

Jul 12 '24 02:07 microyahoo

hi @lowang-bh @hudson741 @Thor-wl, PTAL, thanks.

Jul 12 '24 02:07 microyahoo

/lgtm

Jul 14 '24 06:07 lowang-bh

/ok-to-test

Jul 14 '24 06:07 lowang-bh

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: william-wang

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~pkg/scheduler/OWNERS~~ [william-wang]

Approvers can indicate their approval by writing /approve in a comment.
Approvers can cancel approval by writing /approve cancel in a comment.

Aug 07 '24 01:08 volcano-sh-bot