Likely bug in microservices 2021 traces (in MSRTQps)

Open mutinifni opened this issue 3 years ago • 1 comments

The HTTP_MCR and HTTP_RT metrics appear to have exactly the same values, just at different row indices.

Here's an example for MSRTQps_0.csv using Python pandas.

Data file used:

>>> import pandas as pd
>>> df = pd.read_csv("MSRTQps_0.csv")

Visual inspection (see the "value" column):

>>> df[df["metric"]=="HTTP_RT"].sort_values(by=["value"])
          Unnamed: 0  timestamp                                             msname                                       msinstanceid   metric         value
2507931      2507931     360000  2abd05990aa9eb81a3eaa333829ec355eaee57fbecc9e8...  1241e56851129f2832769f898f4ae42ebed8416c5c4476...  HTTP_RT      0.010833
423481        423481     120000  2abd05990aa9eb81a3eaa333829ec355eaee57fbecc9e8...  329f1b8f8580c79eaca3420cb81bae4e74caf2a715f633...  HTTP_RT      0.012962
13906180    13906180    1740000  2abd05990aa9eb81a3eaa333829ec355eaee57fbecc9e8...  2f4b9b40b5a24ba9b3f0be87f6c674799cccf1f3f2d9f6...  HTTP_RT      0.013814
15452257    15452257    1020000  2abd05990aa9eb81a3eaa333829ec355eaee57fbecc9e8...  425f1aae309527bc491a79f91a315fc5a8a515d1698605...  HTTP_RT      0.013995
831144        831144     300000  2abd05990aa9eb81a3eaa333829ec355eaee57fbecc9e8...  65caf889c5079f53deecaee74542ce793b02669c4243ed...  HTTP_RT      0.014112
...              ...        ...                                                ...                                                ...      ...           ...
258159        258159    1560000  d17f58c4324b523992e8f479804b65a5b965f94877139b...  fa7057b1ee80ed348191db81bc2a4c415a023dd9d0b966...  HTTP_RT  60581.531568
9532330      9532330     180000  d17f58c4324b523992e8f479804b65a5b965f94877139b...  b1950cb1269c9b1ea7db2e0a75ba2e25859a2822539aa1...  HTTP_RT  62518.301342
9535894      9535894    1380000  d17f58c4324b523992e8f479804b65a5b965f94877139b...  f053370b71a161fe6f14ef1447e16f87d272a1a1741f01...  HTTP_RT  62879.484608
562767        562767    1500000  d17f58c4324b523992e8f479804b65a5b965f94877139b...  944ca9882cbcdc712c5ff9ece46ad2f316a2cbb3f9f3c1...  HTTP_RT  65206.585728
9196703      9196703     840000  d17f58c4324b523992e8f479804b65a5b965f94877139b...  9a6d5b9e6e74fb54f287de03966275924a824f4462ba2d...  HTTP_RT  94560.483449

[674161 rows x 6 columns]
>>> df[df["metric"]=="HTTP_MCR"].sort_values(by=["value"])
          Unnamed: 0  timestamp                                             msname                                       msinstanceid    metric         value
2507940      2507940     360000  2abd05990aa9eb81a3eaa333829ec355eaee57fbecc9e8...  1241e56851129f2832769f898f4ae42ebed8416c5c4476...  HTTP_MCR      0.010833
423651        423651     120000  2abd05990aa9eb81a3eaa333829ec355eaee57fbecc9e8...  329f1b8f8580c79eaca3420cb81bae4e74caf2a715f633...  HTTP_MCR      0.012962
13906160    13906160    1740000  2abd05990aa9eb81a3eaa333829ec355eaee57fbecc9e8...  2f4b9b40b5a24ba9b3f0be87f6c674799cccf1f3f2d9f6...  HTTP_MCR      0.013814
15452177    15452177    1020000  2abd05990aa9eb81a3eaa333829ec355eaee57fbecc9e8...  425f1aae309527bc491a79f91a315fc5a8a515d1698605...  HTTP_MCR      0.013995
831091        831091     300000  2abd05990aa9eb81a3eaa333829ec355eaee57fbecc9e8...  65caf889c5079f53deecaee74542ce793b02669c4243ed...  HTTP_MCR      0.014112
...              ...        ...                                                ...                                                ...       ...           ...
258177        258177    1560000  d17f58c4324b523992e8f479804b65a5b965f94877139b...  fa7057b1ee80ed348191db81bc2a4c415a023dd9d0b966...  HTTP_MCR  60581.531568
9532340      9532340     180000  d17f58c4324b523992e8f479804b65a5b965f94877139b...  b1950cb1269c9b1ea7db2e0a75ba2e25859a2822539aa1...  HTTP_MCR  62518.301342
9535905      9535905    1380000  d17f58c4324b523992e8f479804b65a5b965f94877139b...  f053370b71a161fe6f14ef1447e16f87d272a1a1741f01...  HTTP_MCR  62879.484608
562739        562739    1500000  d17f58c4324b523992e8f479804b65a5b965f94877139b...  944ca9882cbcdc712c5ff9ece46ad2f316a2cbb3f9f3c1...  HTTP_MCR  65206.585728
9196731      9196731     840000  d17f58c4324b523992e8f479804b65a5b965f94877139b...  9a6d5b9e6e74fb54f287de03966275924a824f4462ba2d...  HTTP_MCR  94560.483449

[674161 rows x 6 columns]

Equality check:

>>> rt_vals = df[df["metric"] == "HTTP_RT"]["value"].sort_values().reset_index(drop=True)
>>> mcr_vals = df[df["metric"] == "HTTP_MCR"]["value"].sort_values().reset_index(drop=True)
>>> rt_vals.equals(mcr_vals)
True
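
The sorted comparison only shows that the two metrics share the same multiset of values. A stricter check would pair the rows by (timestamp, msname, msinstanceid) and compare the values key by key; the truncated IDs in the printouts above suggest the matching values do come from the same instances. Below is a minimal sketch of that idea, assuming each key has at most one row per metric. `paired_metric_match` is a hypothetical helper, and the toy frame is made-up data, not taken from the trace.

```python
import pandas as pd

# Columns that identify one measurement of one instance at one time.
KEYS = ["timestamp", "msname", "msinstanceid"]

def paired_metric_match(df, a="HTTP_RT", b="HTTP_MCR"):
    """Fraction of keys present for both metrics whose values are identical.

    Assumes at most one row per (key, metric); duplicates would be
    multiplied by the merge and skew the result.
    """
    left = df[df["metric"] == a][KEYS + ["value"]]
    right = df[df["metric"] == b][KEYS + ["value"]]
    merged = left.merge(right, on=KEYS, suffixes=("_a", "_b"))
    if merged.empty:
        return float("nan")
    return float((merged["value_a"] == merged["value_b"]).mean())

# Made-up example: the two metrics agree at timestamp 0 and differ at 60000.
toy = pd.DataFrame({
    "timestamp": [0, 0, 60000, 60000],
    "msname": ["svc1"] * 4,
    "msinstanceid": ["inst1"] * 4,
    "metric": ["HTTP_RT", "HTTP_MCR", "HTTP_RT", "HTTP_MCR"],
    "value": [0.5, 0.5, 1.25, 3.0],
})
match_fraction = paired_metric_match(toy)
print(match_fraction)
```

On the real file this would be called as `paired_metric_match(df)`; a result of 1.0 would mean every paired row carries the same value for both metrics.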

This seems like a bug (the call rate and response time should certainly differ across so many microservice instances). Or am I misunderstanding something about the trace?

Thanks!

mutinifni, Jan 31 '22 07:01

Thanks for your interest and questions!

I confirmed that the values of HTTP_MCR and HTTP_RT are the same, as you said. I will fix it and update the trace in the future.

niewuya, Feb 02 '22 11:02