client_java
Add MetricsFormatter to allow developers to collect metrics in their own way.
Add MetricsFormatter to allow developers to collect metrics in their own way. For more context, see https://github.com/prometheus/client_java/pull/782
@tjiuming As a test, for fun... here is an example initial version of a Collector ~~callback~~ visitor implementation:
~~https://github.com/dhoard/client_java/tree/WIP-collector-callback-support~~ https://github.com/dhoard/client_java/tree/WIP-collector-registry-visitor-support
While I like the ~~callback~~ visitor pattern, etc... I think the paradigm shift would be hard to get adopted.
Update: branch name
@dhoard Do you mean that this PR would be hard to get adopted?
I believe it's an important feature. Could you please give some suggestions that would help this PR get merged?
@tjiuming I am struggling to understand what the feature is about. The description is just a reference to #782, so is this about a performance improvement?
> @dhoard Do you mean that this PR would be hard to get adopted? I believe it's an important feature. Could you please give some suggestions that would help this PR get merged?
This is just my opinion given the amount of work that I feel would be required to implement it correctly.
Implement your solution and provide benchmarks that prove your solution increases performance/decreases memory usage (GCs), etc. (which it may well do.)
@dhoard This PR provides a facade; it's an extension. I don't think we need to worry about a custom implementation's performance.
@tjiuming just pushed some changes to my branch ~~https://github.com/dhoard/client_java/tree/WIP-collector-callback-support~~ https://github.com/dhoard/client_java/tree/WIP-collector-registry-visitor-support that simulate what you are trying to do (at least the core changes.)
Based on my profiling with YourKit... the performance isn't as good when using a "TextFormat" class similar to the current one.
Update: branch name
@dhoard you can try https://github.com/prometheus/client_java/pull/782 (the code is too messy and needs to be refactored); it contains several optimization techniques:
- Use `unsafe` to avoid generating string objects as `doubleToString` does.
- Use Netty's `CompositeDirectByteBuf` to prevent buffer expansions.
- When exposing metrics, we can also use `jetty` and prevent mem_copy by using its `HttpOutput#write(java.nio.ByteBuffer)`.
- When writing `extraLabelName` and `extraLabelValue`, we can use `write(List<String> labelNames, String extraLabelName, List<String> labelValues, double extraLabelValue)` to prevent `new ArrayList()` (see the sketch below).
- In `TextFormat#write004`, there is a local var `omFamilies`; we can also write it to a `subMetricWrite`, and finally, after all metrics are collected, append `subMetricWrite` directly to `mainMetricWrite`'s tail.
Each of the above methods may bring only a small performance improvement, but they are very useful for systems with many meters.
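To illustrate the fourth point, here is a rough sketch of what such a writer overload might look like (a hypothetical method for illustration, not code from either PR):

```java
import java.io.IOException;
import java.io.Writer;
import java.util.List;

// Hypothetical sketch: write the base labels plus one extra label pair
// directly, instead of first copying everything into a merged ArrayList.
final class LabelWriter {
    static void write(Writer writer,
                      List<String> labelNames, String extraLabelName,
                      List<String> labelValues, double extraLabelValue) throws IOException {
        writer.write('{');
        for (int i = 0; i < labelNames.size(); i++) {
            writer.write(labelNames.get(i));
            writer.write("=\"");
            writer.write(labelValues.get(i));
            writer.write("\",");
        }
        writer.write(extraLabelName);
        writer.write("=\"");
        writer.write(Double.toString(extraLabelValue));
        writer.write("\"}");
    }
}
```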
@tjiuming you need to close one of the PRs. It's not clear which branch of code you are actually working with.
> @tjiuming you need to close one of the PRs. It's not clear which branch of code you are actually working with.
I closed that PR. Please help approve this PR.
@tjiuming In my opinion (not an official maintainer)... You need to provide reproducible evidence that the PR provides value.
In my preliminary tests of the PR (ignoring the actual code to write the metrics... since this is what you have done in MetricsFormatterTest)...
- The new collection method takes longer to collect metrics
- GC count has increased
- Average per GC time has decreased
- Overall GC time (GC count * average GC time) has increased.
The stated goal of the original PR was to decrease the creation of Sample objects... but I don't see that as being an issue based on profiling.
The stated goal of this PR is to "allow developers to collect metrics in their own way." Given that the Enumeration<Collector.MetricFamilySamples> lazily collects metrics, I feel this can be accomplished without any changes.
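For example (my own minimal sketch against the existing simpleclient API, not code from the PR), a custom exposition can already walk the lazily collected samples and format them however it likes:

```java
import io.prometheus.client.Collector;
import io.prometheus.client.CollectorRegistry;
import java.io.IOException;
import java.io.Writer;
import java.util.Enumeration;

// Walk the lazily evaluated Enumeration and write each sample in a custom format.
public class CustomExposition {
    public static void write(CollectorRegistry registry, Writer writer) throws IOException {
        Enumeration<Collector.MetricFamilySamples> families = registry.metricFamilySamples();
        while (families.hasMoreElements()) {
            Collector.MetricFamilySamples family = families.nextElement();
            for (Collector.MetricFamilySamples.Sample sample : family.samples) {
                writer.write(sample.name);
                writer.write(' ');
                writer.write(Double.toString(sample.value));
                writer.write('\n');
            }
        }
    }
}
```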
> @tjiuming I am struggling to understand what the feature is about. The description is just a reference to #782, so is this about a performance improvement?
I want to contribute the facade to Prometheus so that developers can implement it in their own projects.
@dhoard @fstab I pushed some commits to the branch https://github.com/tjiuming/client_java/tree/dev/collect_performance. You can run io.prometheus.client.exporter.common.CollectPerformanceTest locally and use jconsole to profile it; heap usage and CPU usage differ.
@dhoard @fstab could you please help review?
Hi @dhoard ,
First, thanks for taking the time to read and experiment with this PR.
I think I can contribute a bit about the motivation for this PR.
When an application that uses the Prometheus client library for defining and exposing metrics has a very large number of time series (metric + labels), and by that I mean millions, every scrape of the /metrics endpoint generates millions of Sample objects. This has a hefty price in terms of:
memory allocations, which eventually lead to longer Garbage Collection cycles, meaning more CPU time dedicated to GC, and stop-the-world GC pauses that cost more, meaning more latency.
Applications that only generate thousands of samples per scrape wouldn't really care/notice. Those at the other end of the spectrum are heavily affected. One example is Apache Pulsar - a messaging system similar to Apache Kafka. As opposed to Kafka, a broker can support up to 1 million topics, each with its own Producers and Consumers. Each topic has its own metrics, so you can easily reach millions of samples per scrape. Since Pulsar is an infrastructure component, it is very sensitive to memory, CPU, and latency caused by GCs.
This PR takes a stab at reducing memory consumption and memory allocations by introducing an additional mechanism for collecting the samples, using the visitor pattern.
A Prometheus Collector will have an additional method for obtaining the samples:
```java
void collect(SamplesVisitor samplesVisitor)
```
and SamplesVisitor would have a different method for each kind of visit:
```java
counterStart(name, labels, description);
counterSample(name, labels, value);
histogramStart(name, labels, description);
...
```
This allows the samples information to be passed without any memory allocations.
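To make that shape concrete, here is a minimal hypothetical sketch (names and signatures are illustrative only, not the actual API proposed in the PR):

```java
import java.util.List;

// Illustrative visitor interface: samples are pushed into it instead of
// being materialized as MetricFamilySamples/Sample objects.
interface SamplesVisitor {
    void counterStart(String name, List<String> labelNames, String help);
    void counterSample(String name, List<String> labelValues, double value);
    void histogramStart(String name, List<String> labelNames, String help);
    // ... gauge, summary, histogram bucket methods, etc.
}

// A collector that exposes its samples through the visitor.
class TopicRequestsCollector {
    void collect(SamplesVisitor visitor) {
        visitor.counterStart("requests_total", List.of("topic"), "Total requests per topic.");
        visitor.counterSample("requests_total", List.of("topic-a"), 42.0);
        visitor.counterSample("requests_total", List.of("topic-b"), 7.0);
    }
}
```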
Apache Pulsar, in this example, would implement its own SamplesVisitor. In our case we write to multiple ByteBuf instances (from Netty), which are located off-heap (thus no GC), and flush those in a certain order to the HTTP OutputStream.
That's the gist of the idea.
I do believe we can create a DevNullSamplesObserver and run it on 1M- and 5M-sample scrapes, comparing the amount of garbage-collectible memory allocated and also CPU time, but I'm not sure it's required for this case.
I haven't used the exact implementation names and design of @tjiuming , because I wanted to convey the idea first.
What do you think?
@asafm Based on the initial description provided by @tjiuming, as well as your description, I am in alignment on the purpose of the change.
My branch https://github.com/dhoard/client_java/tree/WIP-collector-registry-visitor-support contains my initial implementation of the change (as you described.) I personally like the approach.
The 2 PRs (as presented), in my opinion, are disjoint (not clear/messy) and provide no evidence (profiling, proof, testing, etc.) that this is an improvement.
My basic profiling of the change using YourKit (ignoring the actual Visitor implementation) doesn't show a remarkable improvement on my machine using a JUnit test (https://github.com/dhoard/client_java/blob/WIP-collector-registry-visitor-support/simpleclient/src/test/java/io/prometheus/client/CollectorRegistryTest2.java) - real-world usage would be better.
I'm not a maintainer, so other than my own interest/curiosity, my opinions are just that - one person's opinions.
@fstab @tomwilkie @brian-brazil and others would need to weigh in on this alternate approach to collecting metrics once @tjiuming and you can provide the necessary proof that it's a solid change. This may or may not be a direction they want to go with the library - I can't speak for them.
> My basic profiling of the change using YourKit (ignoring the actual Visitor implementation) doesn't show a remarkable improvement on my machine using a JUnit test
Java GC is pretty smart when it comes to very short-lived garbage, so I'm not surprised. Even then we're probably only talking a few hundred MB, which isn't much given the use case. Considering how often this is called, this sort of micro-optimisation is unlikely to make sense in broader terms when weighed against the additional API complexity. The API we have already provides a generic way to allow for other output formats.
More generally a single target exposing millions of samples is far beyond what is typical, and will likely cause cardinality problems on the Prometheus end, if you can even manage to consistently scrape it successfully once per scrape interval.
I would suggest you look at solutions that don't involve trying to expose per-topic information in systems with millions of topics.
@brian-brazil Thanks for taking the time to review this. I do agree this isn't the standard case, and I do agree the API will be more complicated with an additional method for collecting metrics.
On the other hand, in systems that are super sensitive to latency, such as Apache Pulsar, or databases in general, a 3rd-party library, especially an observability one, should aim to inflict as little impact as possible - to be a "fly on the wall" in the "room", so to speak. In my opinion, that additional visitor-pattern-based collect() method can be used to achieve that, especially if you also change the way the client writes the collected metrics, directly to the HTTP response stream, without going through those intermediate objects. Then the "default" implementation will have less impact on the application side.
We're definitely looking at application-level ways to overcome the 1M unique time series issue.
I understand if you decide to leave things as is.
> if you also change the way the client writes the collected metrics, directly to the HTTP response stream, without going through those intermediate objects
We have to buffer the output at least, otherwise we don't know whether an error occurred during collection, which requires an HTTP 500 to be returned.
I'm not familiar enough with the client - what can cause an error when iterating over existing collectors? It's all under the client's control, no?
A collector could throw an exception, or in future there may be additional checking that the output would be valid.
@brian-brazil I've read the code a few times now, and I can't find where the HTTP server catches an exception during collection and returns 500. I couldn't find any buffering either. I saw that each time we get a collector, we ask for its samples and iterate over them, writing them one by one to the stream.
Could you please guide me to that location?
It's not code that we explicitly have, the HTTP servers do it for us.
You can see output buffering in https://github.com/prometheus/client_java/blob/master/simpleclient_httpserver/src/main/java/io/prometheus/client/exporter/HTTPServer.java#L86 for example.
I searched through the Sun HttpServer and the client's HTTPServer and couldn't find anything that catches an exception and sends 500.
So I took TestHTTPServer.testSimpleRequest() and modified it a bit:
```java
@Test
public void testSimpleRequest() throws IOException {
    HTTPServer httpServer = new HTTPServer(new InetSocketAddress(0), registry);
    Collector collector = new Collector() {
        @Override
        public List<MetricFamilySamples> collect() {
            throw new RuntimeException("Something went wrong during collect");
        }
    };
    registry.register(collector);
    try {
        HttpResponse response = createHttpRequestBuilder(httpServer, "/metrics").build().execute();
        System.out.println("Status code = " + response.getResponseCode());
    } finally {
        httpServer.close();
    }
}
```
The result is that the connection is dead in execute(), since it reaches:
```java
} catch (Exception e4) {
    logger.log (Level.TRACE, "ServerImpl.Exchange (4)", e4);
    closeConnection(connection);
} catch (Throwable t) {
    logger.log(Level.TRACE, "ServerImpl.Exchange (5)", t);
    throw t;
}
```
in ServerImpl
Regarding the buffering - thanks for the reference, I somehow missed that.
@asafm You are correct - a 500 Internal Server Error response is not sent if an Exception occurs during collection.
The HTTPServer code is designed/implemented such that a collection has to be 100% successful for a response (hence the buffering.)
For a failed collection, there are two approaches to handle the Exception:
- return no response/close the connection.
- return a `500 Internal Server Error` response.
The code currently implements approach 1.
There are arguments for both approaches... but ideally, in my opinion, we should at least catch/log the Exception.
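As an illustration of approach 2, here is a rough sketch of a handler built on the JDK's com.sun.net.httpserver (my own hypothetical code, not the library's handler): the whole exposition is buffered first, and the status code is only chosen once collection has finished.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpHandler;
import io.prometheus.client.CollectorRegistry;
import io.prometheus.client.exporter.common.TextFormat;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical handler sketch: buffer the full response, then decide between 200 and 500.
class MetricsHandler implements HttpHandler {
    private final CollectorRegistry registry = CollectorRegistry.defaultRegistry;

    @Override
    public void handle(HttpExchange exchange) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try {
            Writer writer = new OutputStreamWriter(buffer, StandardCharsets.UTF_8);
            TextFormat.write004(writer, registry.metricFamilySamples());
            writer.flush();
            exchange.sendResponseHeaders(200, buffer.size());
        } catch (Exception e) {
            // Approach 2: report the failed collection explicitly.
            buffer.reset();
            buffer.write("collection failed".getBytes(StandardCharsets.UTF_8));
            exchange.sendResponseHeaders(500, buffer.size());
        }
        exchange.getResponseBody().write(buffer.toByteArray());
        exchange.close();
    }
}
```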
Thanks for taking the time to verify @dhoard.
I do agree that if the library already has buffering in place, it makes sense to try/catch and return a 500 in case the collection fails.
One idea that can be used, whether in tandem with what I suggested or not, is a ByteBuffer-backed OutputStream. That ByteBuffer would be allocated directly, meaning it's off-heap, which removes any need for garbage collection. I found this as an example.
Then, using the new visitor pattern, the default HTTP server can use it and write directly off-heap, without any additional objects allocated, thereby minimizing the impact on the application this library is running in.
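Something along these lines, as a hedged sketch of the idea (a real implementation would also need to grow or flush the buffer on overflow):

```java
import java.io.OutputStream;
import java.nio.ByteBuffer;

// Sketch: an OutputStream backed by a direct (off-heap) ByteBuffer.
class ByteBufferOutputStream extends OutputStream {
    private final ByteBuffer buffer;

    ByteBufferOutputStream(int capacity) {
        this.buffer = ByteBuffer.allocateDirect(capacity);
    }

    @Override
    public void write(int b) {
        buffer.put((byte) b);
    }

    @Override
    public void write(byte[] b, int off, int len) {
        buffer.put(b, off, len);
    }

    // Expose the buffer ready for reading, e.g. to hand it to a channel.
    ByteBuffer toReadableBuffer() {
        ByteBuffer readable = buffer.duplicate();
        readable.flip();
        return readable;
    }
}
```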
@asafm the code uses a thread-local ByteArrayOutputStream when writing results:
https://github.com/prometheus/client_java/blob/4ddd60c3d161272c6a9434c5f1ac519eab964fb9/simpleclient_httpserver/src/main/java/io/prometheus/client/exporter/HTTPServer.java#L89-L90
This eliminates the GC issue around the buffer (see the conversation on ByteArrayOutputStream memory usage: https://github.com/prometheus/client_java/issues/703).
Moving to a ByteBufferOutputStream would move the memory off-heap, but shouldn't be a performance improvement.
Interesting. I didn't think of those implications - having the same buffer 5 times, although you only need one, does not seem so good. I agree with you - it does eliminate the GC issue. Moving to ByteBuffer wouldn't improve performance, but it would save on memory consumption (5 unused arrays --> 1 array off-heap).
Circling back to the original issue - the visitor pattern can reduce the burden of memory allocations, with a bit of added complexity. As I said previously, I believe an observability library should aim to minimize its impact on a running application as much as possible, and I think the price to pay here is quite small to get one more footing towards that goal.
Combine it with moving to a ByteBufferOutputStream, and you reduce heap memory consumption while the visitor pattern reduces the garbage collection work. Both serve to minimize the impact on the hosting application.
@asafm As I pointed out earlier, I like the visitor pattern, which is why I am in the conversation/provided feedback.
Adoption/inclusion is up to the maintainers. (I'm not a maintainer.)
Ok, I'll wait to see what @brian-brazil thinks about all of this conversation.
Hi, first of all, thanks a lot for the discussion. I really appreciate everyone taking the time to look into this and to find a good solution.
A few remarks on points that were mentioned above but aren't directly related to the PR:
> having the same buffer 5 times
We start only 1 thread by default, so there is only one thread local buffer (see executor.setCorePoolSize(1)). The number of threads will increase up to 5 if there are multiple scrapes in parallel, but as long as scrapes are sequential there should be only a single thread and only a single buffer.
> couldn't find anything that catches an exception and sends 500
Yes, your analysis is correct. I just want to add that closing the connection without sending an HTTP response will make the Prometheus server assume a failed scrape, and the up metric will be 0. So the Prometheus server handles this in exactly the same way as a 500 response.
> catch/log the Exception
Technically it is caught and logged in the HttpServer, so if you configure java.util.logging with a ConsoleHandler and set the log level for "com.sun.net.httpserver" to ALL you'll see it. I totally get that this is not convenient. However I'm also hesitant to include a logging library as a dependency in client_java. This will almost certainly conflict with the logging libraries in the applications that use client_java. To avoid potential conflicts it might be better to just throw the exception and not introduce our own logging in client_java.
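For reference, enabling that logging looks roughly like this (a sketch of standard java.util.logging configuration, not client_java code):

```java
import java.util.logging.ConsoleHandler;
import java.util.logging.Level;
import java.util.logging.Logger;

// Surface the built-in HttpServer's internal logging so that exceptions
// thrown during collection become visible on the console.
public class HttpServerLogging {
    // Keep a strong reference so the logger configuration is not garbage collected.
    private static final Logger HTTP_LOGGER = Logger.getLogger("com.sun.net.httpserver");

    public static void enable() {
        HTTP_LOGGER.setLevel(Level.ALL);
        ConsoleHandler handler = new ConsoleHandler();
        handler.setLevel(Level.ALL);
        HTTP_LOGGER.addHandler(handler);
    }
}
```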
Anyway, here are some comments on the actual PR:
- I understand that this is about a performance improvement. However, the PR does not implement the actual performance improvement, but provides an API that allows users to implement their own optimized `collect()` methods. Is that correct? I'm wondering why you don't provide the performance improvement itself, so that everyone can benefit from it.
- Performance improvements should always be verified with experiments. Please implement some benchmarks using jmh. You can find some examples in benchmarks. Please run the benchmarks with both the current implementation and the new implementation and post the results here, so that we can compare the numbers and discuss whether the performance gain is worth the additional complexity (see the sketch below for a possible starting point).
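For what it's worth, a benchmark along the requested lines could start from something like this JMH sketch (the visitor-based variant is only indicated in comments, since its final API is exactly what's under discussion):

```java
import io.prometheus.client.Collector;
import io.prometheus.client.CollectorRegistry;
import io.prometheus.client.Counter;
import java.util.Enumeration;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;

@State(Scope.Benchmark)
public class CollectBenchmark {

    private CollectorRegistry registry;

    @Setup
    public void setup() {
        registry = new CollectorRegistry();
        Counter counter = Counter.build()
                .name("requests_total")
                .help("Total requests per topic.")
                .labelNames("topic")
                .register(registry);
        // Simulate a high-cardinality workload.
        for (int i = 0; i < 100_000; i++) {
            counter.labels("topic-" + i).inc();
        }
    }

    // Baseline: collection that materializes MetricFamilySamples/Sample objects.
    @Benchmark
    public int collectViaSamples() {
        int count = 0;
        Enumeration<Collector.MetricFamilySamples> families = registry.metricFamilySamples();
        while (families.hasMoreElements()) {
            count += families.nextElement().samples.size();
        }
        return count;
    }

    // Proposed path (pseudo-code): push samples into a no-op visitor/formatter
    // and compare allocation rate and throughput against the baseline.
    // @Benchmark
    // public void collectViaVisitor() { registry.visit(new NoOpSamplesVisitor()); }
}
```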