prometheus icon indicating copy to clipboard operation
prometheus copied to clipboard

PromQL parser mistakingly interpreting UTF8 escape character

Open brancz opened this issue 2 years ago • 15 comments

What did you do?

We use the PromQL parser in the Parca project. We noticed something interesting when label values contained the \x2d value. Ultimately we noticed that the PromQL parser interprets this, not as the string value, but rather interprets the UTF-8 escape character, which in this case is the dash (-).

A resulting inconsistency can be triggered using the following config:

scrape_configs:
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]
        labels:
          faulty_label: \x2d

Then query Prometheus, for example using the prometheus_build_info metric.

Screenshot 2022-11-25 at 17 18 20

If we then use the UI to copy the label-matcher, and insert it into the query and query, however, we get an empty result.

Screenshot 2022-11-25 at 17 23 38

Only if we explicitly escape the label value do we get the result we were looking for with the previous query:

Screenshot 2022-11-25 at 17 24 17

What did you expect to see?

I expected one of the following two, for Prometheus' behavior to be consistent within itself:

  • The PromQL parser retains the actual string literal and does not interpret the UTF-8 escape character.
  • Table renders the label-value in its escaped form, the autocompletion suggests the label-value in its escaped form, and therefore copying the matcher by clicking on it in the table also uses the escaped form.

What did you see instead? Under which circumstances?

Described in the first section.

System information

Darwin 21.6.0 arm64

Prometheus version

$ ./prometheus --version
prometheus, version 2.40.3 (branch: HEAD, revision: 84e95d8cbc51b89f1a69b25dd239cae2a44cb6c1)
  build user:       root@692b092db288
  build date:       20221124-09:15:54
  go version:       go1.19.3
  platform:         darwin/arm64

Prometheus configuration file

scrape_configs:
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]
        labels:
          faulty_label: \x2d

brancz avatar Nov 25 '22 16:11 brancz

I don’t think we’ve established yet whether the parser is faulty or the tooling around query completion is behaving inconsistently. So we don’t really know yet how we would want to fix it.

brancz avatar Nov 25 '22 17:11 brancz

cc @roidelapluie @juliusv

brancz avatar Nov 25 '22 17:11 brancz

I think in the UI we can't do {a=~"1\.2"}. Is it the same bug?

roidelapluie avatar Nov 25 '22 18:11 roidelapluie

The issue is really just that the table output is really dumb and doesn't do any escaping (besides automatic HTML safety stuff): https://github.com/prometheus/prometheus/blob/dfa5cd55db548b8f1355b5aabb7cf869491272a2/web/ui/react-app/src/pages/graph/SeriesName.tsx#L37

The rest seems to work fine.

juliusv avatar Nov 25 '22 18:11 juliusv

Screenshot_20221125-191143~2

roidelapluie avatar Nov 25 '22 18:11 roidelapluie

@roidelapluie That's a different issue. Double-quoted strings in PromQL interpret escape sequences starting with \, and \. is not a valid Go string escape sequence. Try backticks for no interpretation of escape sequences within the string instead:

{a=~`1\.2`}

juliusv avatar Nov 25 '22 18:11 juliusv

The issue is really just that the table output is really dumb and doesn't do any escaping (besides automatic HTML safety stuff):

It's not just the table though, when you autocomplete and select the suggested value I think the behavior is unexpected. Quick screen recording to demonstrate what I mean:

https://user-images.githubusercontent.com/4546722/204038170-011c5f6e-c8ca-4e39-b53b-9a0d2797cc58.mov

brancz avatar Nov 25 '22 18:11 brancz

@brancz Hm yeah. Optimally the autocompletion would be able to understand the string context it's in (double-quoted string vs. backtick-quoted) and escape any inserted label values accordingly, if necessary.

juliusv avatar Nov 25 '22 18:11 juliusv

@roidelapluie Btw., regarding our string interpretation: https://prometheus.io/docs/prometheus/latest/querying/basics/#string-literals

juliusv avatar Nov 25 '22 18:11 juliusv

@brancz Hm yeah. Optimally the autocompletion would be able to understand the string context it's in (double-quoted string vs. backtick-quoted) and escape any inserted label values accordingly, if necessary.

Yeah agreed, that's option 2 that I suggested tl;dr the parser is behaving as expected, but the query builder/tooling is not quite.

brancz avatar Nov 25 '22 18:11 brancz

Yep, both the autocompletion and the table output is broken, but the latter is more blatantly broken :P

If anyone wants to pick this up before I or Augustin get to it, basically you'd need to escape label values according to Go string escaping rules in the table output in both the formatted variant (https://github.com/prometheus/prometheus/blob/dfa5cd55db548b8f1355b5aabb7cf869491272a2/web/ui/react-app/src/pages/graph/SeriesName.tsx#L37) and the unformatted (with many series) variant (https://github.com/prometheus/prometheus/blob/dfa5cd55db548b8f1355b5aabb7cf869491272a2/web/ui/react-app/src/utils/index.ts#L37). The formatted variant of this is also used for the legend in the graph view.

juliusv avatar Nov 25 '22 18:11 juliusv

hey, I'd like to work on this! :cowboy_hat_face:

exitflynn avatar Nov 25 '22 19:11 exitflynn

@exitflynn Yay! Go ahead :)

juliusv avatar Nov 25 '22 19:11 juliusv