Adaptive policy dashboard

Open rahulguptajss opened this issue 1 year ago • 4 comments

This is a follow-up story from #2267 and #1426. We should create adaptive policy panels similar to the QoS fixed % panels in the Workload dashboard.

rahulguptajss avatar Nov 21 '23 08:11 rahulguptajss

Requested by Pacqui on discord

rahulguptajss avatar Feb 26 '24 05:02 rahulguptajss

Calculating the Used Percentage of an Adaptive Policy

Let's take a volume as an example. There are two main cases to consider:

1. IOPS Calculation

If the adaptive QoS policy's peak-iops-allocation is allocated_space, peak-allowed-iops is calculated as:

peak-allowed-iops = max(absolute-min-iops, volume_size (in TB) * peak-iops (per TB))

If peak-iops-allocation is used_space, peak-allowed-iops is calculated as:

peak-allowed-iops = max(absolute-min-iops, volume_used_size (in TB) * peak-iops (per TB))

We'll need to fetch volume_size and volume_used_size for the relevant workload-to-volume mapping. The percentage used can then be calculated as:

% = (qos_ops / peak-allowed-iops) * 100
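
As a rough illustration of the two IOPS cases above, here is a minimal Python sketch. The function names and the sample policy values (500 absolute-min-iops, 1000 peak-iops per TB) are hypothetical and not taken from Harvest; the same function covers both cases, and only the size passed in changes.

```python
def peak_allowed_iops(absolute_min_iops, peak_iops_per_tb, size_tb):
    """Peak IOPS allowed for a workload under an adaptive policy.

    size_tb is volume_size when peak-iops-allocation is allocated_space,
    or volume_used_size when it is used_space.
    """
    return max(absolute_min_iops, size_tb * peak_iops_per_tb)


def used_percent(qos_ops, allowed_iops):
    """Percentage of the allowed peak that the workload is consuming."""
    return qos_ops / allowed_iops * 100


# Hypothetical policy: absolute-min-iops = 500, peak-iops = 1000 per TB.
allocated = peak_allowed_iops(500, 1000, size_tb=2.0)  # allocated_space, 2 TB volume
used = peak_allowed_iops(500, 1000, size_tb=0.25)      # used_space, 0.25 TB used

print(allocated, used, used_percent(1000, allocated))
```

In the used_space case the computed value (250) falls below absolute-min-iops, so the floor of 500 applies.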

2. Throughput Calculation

Adaptive IOPS also applies to throughput. If block-size is ANY, this calculation does not apply; for any other block size:

peak-allowed-mbps = (peak-iops * block-size) / 1000
qos-mbps = qos_total_data (in mbps)
% = (qos-mbps / peak-allowed-mbps) * 100
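
The throughput case can be sketched the same way. This is only an illustration under stated assumptions: the block size is assumed to be expressed in KB (which is what dividing by 1000 suggests), `None` stands in for a block size of ANY, and the function names are hypothetical.

```python
def peak_allowed_mbps(peak_iops, block_size_kb):
    """Throughput ceiling implied by the IOPS limit.

    Returns None when block_size_kb is None (block-size ANY),
    because the calculation does not apply in that case.
    Assumption: block size is given in KB.
    """
    if block_size_kb is None:
        return None
    return peak_iops * block_size_kb / 1000


def throughput_used_percent(qos_mbps, peak_iops, block_size_kb):
    """Percentage of the throughput ceiling in use, or None if not applicable."""
    limit = peak_allowed_mbps(peak_iops, block_size_kb)
    if limit is None:
        return None
    return qos_mbps / limit * 100


print(peak_allowed_mbps(2000, 32))            # 32 KB blocks
print(throughput_used_percent(16, 2000, 32))
print(throughput_used_percent(16, 2000, None))  # block-size ANY
```

Returning `None` for ANY keeps "not applicable" distinct from a real 0% reading.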

References

https://docs.netapp.com/us-en/ontap/performance-admin/adaptive-qos-policy-groups-task.html
https://library.netapp.com/ecmdocs/ECMLP2853092/html/GUID-9019A6AB-5EF6-47A2-8F99-C7D05B77FD9D.html

rahulguptajss avatar Mar 06 '24 06:03 rahulguptajss

Hello @rahulguptajss

First of all, thanks for sharing all the information in such an ordered and visual way.

I have some doubts about the information shared. Your example assumes that the adaptive QoS policy is applied at the volume level, but in our particular case we apply adaptive QoS at the LUN level. Additionally, we don't use throughput to limit the ceiling or floor (other clients probably do). Finally, we don't have a block size defined for every QoS policy either; in our case the block size is ANY, and the Peak and Expected IOPS Allocation are both allocated-space.

All of this is only our use case, though, so I don't know whether the calculations you shared can be applied and adapted to the different ways of defining an adaptive QoS policy.

I hope I have explained myself clearly. Thanks.

faguayot avatar Mar 11 '24 17:03 faguayot

Thanks, @faguayot. Yes, the write-up above covers volumes and will be expanded to include LUNs as well. Regarding throughput, some users may be using Block Size, so that's a use case I've highlighted to handle. Given the information you've provided, your use cases will be covered in the implementation.

rahulguptajss avatar Mar 12 '24 11:03 rahulguptajss

@faguayot The Adaptive Used% panels are available in the nightly release through the Workload dashboard. Please let us know your feedback. Below is a screenshot for reference.

image

rahulguptajss avatar May 03 '24 07:05 rahulguptajss

Hello Rahul,

I have downloaded the nightly build from the link you shared, but I don't see the same information about the % of Adaptive QoS.

image

The empty QoS bandwidth panel is expected, since we don't configure or use it. But where the list of workloads, clusters, etc. should appear, I only see %qos used, IOPS, and Max IOPS.

Thanks. It looks great in your dashboard image.

faguayot avatar May 03 '24 08:05 faguayot

Thank you, @faguayot, for the feedback. Could you please re-import the Workload dashboard using this JSON, as shown here?

If this doesn't resolve the issue, we'll investigate the Prometheus query next.

rahulguptajss avatar May 03 '24 09:05 rahulguptajss

I have tried importing the JSON as a new dashboard, but the result is the same:

image

faguayot avatar May 03 '24 09:05 faguayot

Okay, let's try checking the Prometheus query results.

Could you share the output of the Prometheus queries below? Please replace "Workload name" with a name from your environment where the adaptive policy is applied. In my example below, the workload name is RahulTest-wid8678.

You can email us the results at [email protected]

label_join(
  clamp_max(
    (
      (
        qos_ops{workload=~"RahulTest-wid8678"} * 100
      )
      / on(datacenter,cluster,workload,policy_group,volume,lun,svm,qtree,file,wid)
      qos_workload_max_throughput_iops{workload=~"RahulTest-wid8678", is_adaptive="Yes"}
    ),
    100
  ),
  "unique_id",
  "-",
  "datacenter",
  "cluster",
  "workload"
)
image
label_join(
  (
    qos_ops{workload=~"RahulTest-wid8678"}
    * on(datacenter,cluster,workload,policy_group,volume,lun,svm,qtree,file,wid)
    (
      qos_workload_labels{workload=~"RahulTest-wid8678", is_adaptive="Yes"}
      and
      qos_workload_max_throughput_iops{workload=~"RahulTest-wid8678", is_adaptive="Yes"}
    )
  ),
  "unique_id",
  "-",
  "datacenter",
  "cluster",
  "workload"
)
image
label_join(
  qos_workload_max_throughput_iops{
    workload=~"RahulTest-wid8678",
    is_adaptive="Yes"
  },
  "unique_id",
  "-",
  "datacenter",
  "cluster",
  "workload"
)
image
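
For readers less familiar with PromQL, the first query above can be approximated in plain Python. This is an illustrative sketch only: the function names and sample label values are hypothetical, it assumes a non-zero max IOPS, and it mirrors just the arithmetic (`* 100`, division, `clamp_max(..., 100)`) and the `label_join` step, not the label matching done by `on(...)`.

```python
def adaptive_used_percent(qos_ops, max_iops):
    """Mirror of qos_ops * 100 / max_iops wrapped in clamp_max(..., 100).

    Assumes max_iops is non-zero.
    """
    return min(qos_ops * 100 / max_iops, 100)


def unique_id(labels):
    """Mirror of label_join(..., "unique_id", "-", "datacenter", "cluster", "workload").

    Missing labels are treated as empty strings, as label_join does.
    """
    return "-".join(labels.get(k, "") for k in ("datacenter", "cluster", "workload"))


# Hypothetical series labels; only the workload name comes from the thread.
series = {"datacenter": "dc1", "cluster": "c1", "workload": "RahulTest-wid8678"}

print(unique_id(series))
print(adaptive_used_percent(250, 1000))
print(adaptive_used_percent(1500, 1000))  # over the limit, clamped to 100
```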

rahulguptajss avatar May 03 '24 09:05 rahulguptajss

Could you share the version of Grafana you are using? Also, if you edit this table, could you check whether Filter by name has the columns selected as mentioned below?

image

rahulguptajss avatar May 03 '24 10:05 rahulguptajss

@faguayot Could you try re-importing the dashboard from here to see if it helps?

rahulguptajss avatar May 03 '24 12:05 rahulguptajss

Hello rahul,

With this import it is working correctly.

image

faguayot avatar May 03 '24 12:05 faguayot

Thank you, @faguayot. This dashboard issue will be addressed through #2873. Is there any other feedback regarding the panels?

rahulguptajss avatar May 06 '24 07:05 rahulguptajss

Verified in 24.05 release branch at commit 74dcf38a872d2c38e633664314e0acc5b9f8d76b

Tested in grafana 8.3.4, image

Tested in grafana 9.5.15, image

Tested in grafana 10.3.1, image

Tried unselecting and reselecting the cluster column in the table; no extra columns suffixed with 1, 2, etc. appeared during that process. Working as expected.

Hardikl avatar May 15 '24 10:05 Hardikl

Hello Rahul

I am trying the new release with these changes. So far I haven't seen any issues, but if I find something I will let you know. Thanks for the implementation.

faguayot avatar May 29 '24 11:05 faguayot

Thanks, @faguayot!

rahulguptajss avatar May 29 '24 12:05 rahulguptajss