feat(sat): add push metrics support

Open javorszky opened this issue 2 years ago • 1 comments

Closes #115

Enables push metrics on sat to otlp collector running locally.

Here are the parts needed:

Framework / structure / scaffolding

This is the scaffolding that enables collecting data and shipping it somewhere at specified times. This part does not actually configure the meters.

exporter

This part configures the actual exporter: the thing that sends available data to wherever we point it. In this case the exporter sends the data to an otlp collector that it can reach via the connection we configured earlier. Other configuration here includes retries, timeouts, etc.

The interesting / confusing thing is that you can configure the connection through the exporter (as done here), or as a separate connection, and pass that in with otlpmetricgrpc.WithGRPCConn(conn).

controller

This is the thing that will actually facilitate the polling / collecting of the data, and then send it to the exporter every

provider

Lastly the controller is then set as the global metrics provider. It is global, because that's how most other metrics solutions work anyways.
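The exporter / controller / provider split described above can be pictured with a small stdlib-only sketch. None of the types below are the OpenTelemetry API: Exporter, Controller, memoryExporter, SetGlobal, Global and the metric name function.executions are all hypothetical names, made up here only to show how the three pieces relate.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// Exporter is a hypothetical stand-in for the OTLP exporter: it ships a
// snapshot of the collected values to wherever it is pointed.
type Exporter interface {
	Export(snapshot map[string]int64) error
}

// Controller is a hypothetical stand-in for the push controller: it holds
// the current values and hands a copy to the exporter on every tick.
type Controller struct {
	mu       sync.Mutex
	values   map[string]int64
	exporter Exporter
	interval time.Duration
}

func NewController(e Exporter, interval time.Duration) *Controller {
	return &Controller{values: map[string]int64{}, exporter: e, interval: interval}
}

// Add is what a meter would call to record a measurement.
func (c *Controller) Add(name string, delta int64) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.values[name] += delta
}

// pushOnce copies the current values and sends the copy to the exporter.
func (c *Controller) pushOnce() error {
	c.mu.Lock()
	snapshot := make(map[string]int64, len(c.values))
	for name, value := range c.values {
		snapshot[name] = value
	}
	c.mu.Unlock()
	return c.exporter.Export(snapshot)
}

// Run pushes on every tick until ctx is cancelled.
func (c *Controller) Run(ctx context.Context) {
	ticker := time.NewTicker(c.interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			if err := c.pushOnce(); err != nil {
				fmt.Println("export failed:", err)
			}
		}
	}
}

// global plays the role of the global meter provider.
var global *Controller

func SetGlobal(c *Controller) { global = c }
func Global() *Controller     { return global }

// memoryExporter keeps snapshots in memory instead of sending them anywhere.
type memoryExporter struct {
	mu        sync.Mutex
	snapshots []map[string]int64
}

func (m *memoryExporter) Export(snapshot map[string]int64) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.snapshots = append(m.snapshots, snapshot)
	return nil
}

func main() {
	exporter := &memoryExporter{}
	SetGlobal(NewController(exporter, 10*time.Millisecond))

	ctx, cancel := context.WithTimeout(context.Background(), 100*time.Millisecond)
	defer cancel()

	Global().Add("function.executions", 1)
	Global().Run(ctx) // blocks until the timeout fires

	exporter.mu.Lock()
	defer exporter.mu.Unlock()
	fmt.Println("snapshots pushed:", len(exporter.snapshots))
}
```

In the real code the exporter talks gRPC to the collector and the controller comes from the OpenTelemetry SDK; the sketch only mirrors the shape of the wiring.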

The actual meters

Configured in metrics.ConfigureMetrics. This part very much requires that the global meter provider is already up and running.

There are four different types of meters that we can choose from: sync / async and integer / float. This choice would impact every single meter, though nothing is stopping us from using different types.

The following three meters are instantiated, attached to a struct, and then returned and stored on the Sat struct, so consuming code can access them.

function executions

This is a unitless counter. Every time sat's handle method is called, this is incremented by one, regardless of the outcome of the function call.

failed function executions

Unitless counter. Every time sat's handle returns an error, this is incremented by one. Its value must always be less than or equal to the number of function executions.
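As a rough illustration of how the two counters relate, here is a minimal stdlib-only sketch. execCounters, instrument and errBoom are hypothetical names, not the ones used in sat, and plain atomic integers stand in for the real counter meters. Because the execution counter is bumped before the handler runs and the failure counter only when it returns an error, failed can never exceed executions.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

// execCounters is a hypothetical stand-in for the struct holding the two
// counter meters.
type execCounters struct {
	executions int64
	failed     int64
}

// instrument counts every call, and additionally counts the calls that
// return an error, regardless of what the handler does otherwise.
func (c *execCounters) instrument(handle func() error) error {
	atomic.AddInt64(&c.executions, 1)
	err := handle()
	if err != nil {
		atomic.AddInt64(&c.failed, 1)
	}
	return err
}

func (c *execCounters) Executions() int64 { return atomic.LoadInt64(&c.executions) }
func (c *execCounters) Failed() int64     { return atomic.LoadInt64(&c.failed) }

// errBoom is a made-up error used to simulate a failing function call.
var errBoom = errors.New("boom")

func main() {
	counters := &execCounters{}
	_ = counters.instrument(func() error { return nil })
	_ = counters.instrument(func() error { return errBoom })
	fmt.Println(counters.Executions(), counters.Failed()) // prints "2 1"
}
```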

function time

A timer is started before we start function execution, and then observation happens after it. This is a histogram meter with a unit of milliseconds.

There is a metrics.Timer struct that saves the current time on instantiation (NewTimer). Passing the histogram into its Observe method computes the elapsed time (time at Observe minus time at instantiation) and records it on the histogram meter in milliseconds.
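A minimal sketch of that timer, assuming a simplified histogram: the Recorder interface and sliceRecorder type are hypothetical stand-ins for the real histogram meter, and the actual Observe signature in the metrics package may differ (for example, it may also take a context).

```go
package main

import (
	"fmt"
	"time"
)

// Recorder is a hypothetical stand-in for the histogram meter.
type Recorder interface {
	Record(milliseconds int64)
}

// Timer remembers when it was created.
type Timer struct {
	start time.Time
}

// NewTimer saves the current time.
func NewTimer() Timer {
	return Timer{start: time.Now()}
}

// Observe records the time elapsed since NewTimer, in whole milliseconds.
func (t Timer) Observe(h Recorder) {
	h.Record(time.Since(t.start).Milliseconds())
}

// sliceRecorder collects observations in memory so they can be inspected.
type sliceRecorder struct {
	values []int64
}

func (s *sliceRecorder) Record(milliseconds int64) {
	s.values = append(s.values, milliseconds)
}

func main() {
	histogram := &sliceRecorder{}

	timer := NewTimer()
	time.Sleep(20 * time.Millisecond) // stands in for the function execution
	timer.Observe(histogram)

	fmt.Println("observed milliseconds:", histogram.values)
}
```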

potential refactors for the future

  • the grpc connection used for tracing and metrics could potentially be shared

known limitations at the time of writing this message

  • ~~if the collector is not running, there is no expo fallback with retries option, sat will refuse to start up~~ not actually the case
  • ~~there is no way to configure a noop metrics at the moment, though that's a fairly easy addition~~ is now solved

How to test / enable this?

  1. check out this branch
  2. make sure you have the following environment variables set on your host computer (or edit the docker compose file directly)
    • ATMO_TRACER_HONEYCOMB_API_KEY: that's your honeycomb api key that you can see in the Suborbital team settings page
    • ATMO_TRACER_HONEYCOMB_API_ENDPOINT: value should be api.honeycomb.io:443
    • ATMO_TRACER_HONEYCOMB_DATASET: set this to any string. I use gabor-breaks-things. This needs to be set, otherwise Honeycomb is going to complain (but still retain the data in a bucket called unknown-dataset)
  3. launch the collector with docker compose up collector
  4. build sat with make sat
  5. run sat with make runlocal. This will start sat up on a random port. The logs will tell you which port it is. It will look like this:
    {"log_message":"(I) serving on :1268","timestamp":"2022-05-31T18:56:18.048109+01:00","level":3,"app":{"sat_version":"v0.1.4"}}
    
  6. Open up postman or your favourite tool for sending POST requests, and send a bunch of POST requests to localhost:1268 (in my case) with some request body
    1. there's a shorthand make target that needs hey installed locally: make bombard PORT=1268 where 1268 is whatever port sat is running on. That will send 10k requests at sat.
  7. The file at ./traces/traces.json should now have entries about function executions and function timings.

javorszky avatar May 31 '22 18:05 javorszky

Hell yeah, this is working nicely!

Extracted metrics and tracing into go-kit, and wired it back in. Only sat-specific code remains in sat, general code (salutes) is now in gokit.

javorszky avatar Jun 07 '22 13:06 javorszky