rust-libp2p icon indicating copy to clipboard operation
rust-libp2p copied to clipboard

protocols/gossipsub: Enable the `protobuf` feature of prometheus-client

Open ackintosh opened this issue 3 years ago • 6 comments

👷 This PR is a work in progress 👷

Description

The latest revision of prometheus-client has OpenMetrics protobuf support. Which will be released in v0.19.

Prior to the release, I'm working to enable the protobuf feature on gossipsub metrics.

Open Questions

Change checklist

  • [ ] I have performed a self-review of my own code
  • [ ] I have made corresponding changes to the documentation
  • [ ] I have added tests that prove my fix is effective or that my feature works
  • [ ] A changelog entry has been made in the appropriate crates

ackintosh avatar Sep 18 '22 23:09 ackintosh

Interesting, I wasn't aware that libp2p-gossipsub has its own metrics built-in. I think all other protocols collect their metrics by inspecting events emitted from the behaviour.

I guess for libp2p-gossipsub, those metrics are a lot more fine granular and couldn't be gathered through emitted events? I think this is fine but I have two questions:

  • Why isn't prometheus-client an optional dependency in libp2p-gossipsub? Metrics are optional in the rest of rust-libp2p.
  • Why would we need to enable protobuf support here? Isn't the encoding of metrics entirely an application decision that is irrelevant from how they are collected?

thomaseizinger avatar Sep 19 '22 05:09 thomaseizinger

metrics were added some good time ago to analyze mesh health and scoring behaviour. Back then we discussed using events as the rest of the protocols but the amount of new events that we would need for this make no sense in general.

About being optional, I don't remember why we went this route instead of a compile flag. We should probably revisit that option but I also think that is outside the scope of this PR

divagant-martian avatar Sep 21 '22 00:09 divagant-martian

metrics were added some good time ago to analyze mesh health and scoring behaviour. Back then we discussed using events as the rest of the protocols but the amount of new events that we would need for this make no sense in general.

About being optional, I don't remember why we went this route instead of a compile flag. We should probably revisit that option but I also think that is outside the scope of this PR

Okay, thanks for the explanation :)

I still don't understand why we need this PR. The protobuf feature of prometheus-client is about encoding metrics, correct? Unless I am missing something, that needs to be done in the application using its registry. If an application wishes to encode their metrics in protobuf format, they can do that by enabling the feature in their manifest. Enabling the feature here will just lead to dependency bloat for users that don't use the protobuf feature.

thomaseizinger avatar Sep 21 '22 01:09 thomaseizinger

I've opened an issue about feature-gating the metrics support: https://github.com/libp2p/rust-libp2p/issues/2923

thomaseizinger avatar Sep 21 '22 01:09 thomaseizinger

I still don't understand why we need this PR

Because the default Registry that prometheus exposes has a generic that uses the default text encoder. There is no way right now to use the new proto encoder using gossipsub. With this I mean, gossipsub as is does not use a registry generic over the encoder, but with a fixed one (text encoding)

I've edited the paragraph above way too many times, I hope I manage to communicate the idea

Of course if you think / find a better option we should explore it

divagant-martian avatar Sep 21 '22 03:09 divagant-martian

I still don't understand why we need this PR

Because the default Registry that prometheus exposes has a generic that uses the default text encoder. There is no way right now to use the new proto encoder using gossipsub. With this I mean, gossipsub as is does not use a registry generic over the encoder, but with a fixed one (text encoding)

I've edited the paragraph above way too many times, I hope I manage to communicate the idea

Of course if you think / find a better option we should explore it

Okay that makes sense, thanks for explaining :)

I hacked something together here: https://github.com/libp2p/rust-libp2p/tree/hack-templated-metric-gossip-sub Note that it depends on a fork of prometheus-client: https://github.com/thomaseizinger/client_rust/tree/hack-into-metric

It doesn't compile because the Encode derive macro seems to be broken, so I couldn't full verify that it works but I think this approach should work. The API of libp2p-gossipsub could obviously be polished to not have as much duplicated but that was out of scope for my PoC :)

error[E0034]: multiple applicable items in scope
  --> protocols/gossipsub/src/topic.rs:64:62
   |
64 | #[derive(Debug, Clone, PartialEq, Eq, Hash, PartialOrd, Ord, Encode)]
   |                                                              ^^^^^^ multiple `encode` found
   |
   = note: candidate #1 is defined in an impl of the trait `EncodeLabels` for the type `String`
   = note: candidate #2 is defined in an impl of the trait `prost::Message` for the type `String`
   = note: this error originates in the derive macro `Encode` (in Nightly builds, run with -Z macro-backtrace for more info)
help: disambiguate the associated function for candidate #1
   |
64 | #[derive(Debug, Clone, PartialEq, Eq, Hash, PartialOrd, Ord, EncodeLabels::encode(&Encode, Encode))]
   |                                                              ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
help: disambiguate the associated function for candidate #2
   |
64 | #[derive(Debug, Clone, PartialEq, Eq, Hash, PartialOrd, Ord, prost::Message::encode(&Encode, Encode))]
   |                                                              ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

thomaseizinger avatar Sep 21 '22 05:09 thomaseizinger

Because the default Registry that prometheus exposes has a generic that uses the default text encoder. There is no way right now to use the new proto encoder using gossipsub. With this I mean, gossipsub as is does not use a registry generic over the encoder, but with a fixed one (text encoding)

This is unfortunate and very much my fault. I think we should move away from the generic type parameter on Registry and always use dynamic dispatching via a dyn XXX instead where XXX does both text and protobuf encoding. See https://github.com/prometheus/client_rust/pull/82#discussion_r956892586.

mxinden avatar Sep 27 '22 13:09 mxinden

It doesn't compile because the Encode derive macro seems to be broken,

That was my fault when I added the protobuf encoding to prometheus-client. Now that has been fixed on master branch via https://github.com/prometheus/client_rust/pull/93.

ackintosh avatar Sep 28 '22 13:09 ackintosh

It doesn't compile because the Encode derive macro seems to be broken,

That was my fault when I added the protobuf encoding to prometheus-client. Now that has been fixed on master branch via prometheus/client_rust#93.

Okay cool! That should allow my PoC to compile. I don't plan on working on this any time soon but feel free to take whatever you find useful :)

thomaseizinger avatar Sep 28 '22 14:09 thomaseizinger

I have copied the hack as is on my fork of prometheus-client based on the latest master branch which is fixed the derive macro bug: https://github.com/ackintosh/client_rust/tree/into-metric

Now I'm facing the compile error in prometheus-client:

$ cargo build --all-features
   Compiling prometheus-client v0.19.0 (/Users/akihito/src/github.com/ackintosh/client_rust)
error[E0119]: conflicting implementations of trait `registry::IntoMetric`
   --> src/encoding/text.rs:430:1
    |
430 | impl<M> IntoMetric for M where M: EncodeMetric + Send + Sync + 'static {
    | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ conflicting implementation
    |
   ::: src/encoding/proto.rs:126:1
    |
126 | impl<M> IntoMetric for M where M: EncodeMetric + Send + Sync + 'static {
    | ---------------------------------------------------------------------- first implementation here

It looks no conflict for me because the two EncodeMetric should be resolved as the text one and the proto one each... Maybe a lack of my understanding. 🤔

ackintosh avatar Sep 28 '22 15:09 ackintosh

@thomaseizinger @mxinden By the way, here is another attempt to allow the two (text and proto) encoder by Diva: https://github.com/ackintosh/rust-libp2p/pull/55

What do you think?

ackintosh avatar Sep 28 '22 15:09 ackintosh

I have copied the hack as is on my fork of prometheus-client based on the latest master branch which is fixed the derive macro bug: ackintosh/client_rust@into-metric

Now I'm facing the compile error in prometheus-client:

$ cargo build --all-features
   Compiling prometheus-client v0.19.0 (/Users/akihito/src/github.com/ackintosh/client_rust)
error[E0119]: conflicting implementations of trait `registry::IntoMetric`
   --> src/encoding/text.rs:430:1
    |
430 | impl<M> IntoMetric for M where M: EncodeMetric + Send + Sync + 'static {
    | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ conflicting implementation
    |
   ::: src/encoding/proto.rs:126:1
    |
126 | impl<M> IntoMetric for M where M: EncodeMetric + Send + Sync + 'static {
    | ---------------------------------------------------------------------- first implementation here

It looks no conflict for me because the two EncodeMetric should be resolved as the text one and the proto one each... Maybe a lack of my understanding. thinking

Ah damn. It is conflicting because you could have a type that implements both traits and then it is no longer clear, which implementation to chose. One would need negative traits bounds to express this.

thomaseizinger avatar Sep 29 '22 00:09 thomaseizinger

@thomaseizinger @mxinden By the way, here is another attempt to allow the two (text and proto) encoder by Diva: ackintosh#55

What do you think?

It seems like it would require each user of a prometheus-client to define all their metrics for each encoder. That sounds look a lot of duplication to me?

thomaseizinger avatar Sep 29 '22 00:09 thomaseizinger

I saw the PR: https://github.com/prometheus/client_rust/pull/100

ackintosh avatar Sep 30 '22 14:09 ackintosh

Oh yeah that was a quick dirty attempt to get things working. No investment on particular solutions. Tho the current situation is definitely somewhat annoying

divagant-martian avatar Sep 30 '22 14:09 divagant-martian

With https://github.com/prometheus/client_rust/pull/105 and the latest prometheus-client v0.19.0 release, this should be resolved. @ackintosh @divagant-martian please comment in case I am missing something.

mxinden avatar Jan 02 '23 16:01 mxinden