feat(sat): add push metrics support
Closes #115
Enables push metrics from sat to an OTLP collector running locally.
Here are the parts needed:
Framework / structure / scaffolding
This is the scaffolding that enables collecting data and shipping it somewhere at specified times. This part does not actually configure the meters.
exporter
This part configures the actual exporter: the thing that sends available data to wherever we point it. In this case the exporter sends the data to an OTLP collector that it can reach via the connection we configured earlier. Other configuration here includes retries, timeouts, etc.
The interesting / confusing thing is that you can configure the connection through the exporter (as done here), or create it as a separate connection and pass that in with `otlpmetricgrpc.WithGRPCConn(conn)`.
controller
This is the thing that will actually facilitate the polling / collecting of the data, and then send it to the exporter every
provider
Lastly, the controller is set as the global meter provider. It is global because that's how most other metrics solutions work anyway.
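To make the division of labour between exporter, controller and global provider concrete, here is a dependency-free sketch. It does not use the OpenTelemetry SDK: `Exporter`, `memoryExporter`, `Controller`, `SetGlobal` and `Global` are hypothetical names made up for illustration, and the 10ms period is arbitrary. It only shows the shape of a periodic collect-and-push loop, assuming cumulative counter values.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Exporter is a hypothetical stand-in for the OTLP exporter: it ships a
// snapshot of collected values to wherever it is pointed.
type Exporter interface {
	Export(snapshot map[string]int64) error
}

// memoryExporter keeps snapshots in memory so the sketch needs no network.
type memoryExporter struct {
	snapshots []map[string]int64
}

func (e *memoryExporter) Export(snapshot map[string]int64) error {
	e.snapshots = append(e.snapshots, snapshot)
	return nil
}

// Controller collects counter values and pushes them to the exporter.
type Controller struct {
	mu       sync.Mutex
	counters map[string]int64
	exporter Exporter
}

func NewController(exporter Exporter) *Controller {
	return &Controller{counters: map[string]int64{}, exporter: exporter}
}

// Add increments the named counter by delta.
func (c *Controller) Add(name string, delta int64) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counters[name] += delta
}

// CollectAndExport copies the current cumulative values and hands the copy
// to the exporter.
func (c *Controller) CollectAndExport() error {
	c.mu.Lock()
	snapshot := make(map[string]int64, len(c.counters))
	for name, value := range c.counters {
		snapshot[name] = value
	}
	c.mu.Unlock()
	return c.exporter.Export(snapshot)
}

// Run pushes one snapshot per period until stop is closed.
func (c *Controller) Run(period time.Duration, stop <-chan struct{}) {
	ticker := time.NewTicker(period)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			if err := c.CollectAndExport(); err != nil {
				fmt.Println("export failed:", err)
			}
		case <-stop:
			return
		}
	}
}

// globalController plays the role of the global meter provider.
var globalController *Controller

func SetGlobal(c *Controller) { globalController = c }
func Global() *Controller     { return globalController }

func main() {
	exporter := &memoryExporter{}
	SetGlobal(NewController(exporter))

	stop := make(chan struct{})
	done := make(chan struct{})
	go func() {
		Global().Run(10*time.Millisecond, stop)
		close(done)
	}()

	Global().Add("function_executions", 3)
	time.Sleep(50 * time.Millisecond)
	close(stop)
	<-done // Run has returned, so reading the snapshots is safe

	fmt.Println("pushed at least once:", len(exporter.snapshots) > 0)
}
```

Anything that records a value only needs `Global()`, which is why the provider has to be set before any meter is configured.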
The actual meters
Configured in `metrics.ConfigureMetrics`. This part very much requires that the global meter provider is already up and running.
There are four different types of meters that we can choose from: sync / async and integer / float. This choice would impact every single meter, though nothing is stopping us from using different types.
The following three meters are instantiated, attached to a struct, and then returned and stored on the `Sat` struct, so consuming code can access them.
function executions
This is a unitless counter. Every time sat's `handle` method is called, this is incremented by one, regardless of the outcome of the function call.
failed function executions
Unitless counter. Every time sat's `handle` returns an error, this is incremented by one. It must therefore always be less than or equal to the number of function executions.
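The relationship between the two counters can be sketched without the OpenTelemetry SDK. In the snippet below `FunctionMetrics`, `Handle` and `errFunctionFailed` are hypothetical names, and plain atomic integers stand in for the real counter meters; the point is only that every call bumps the first counter and only failing calls bump the second.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

// FunctionMetrics is a hypothetical stand-in for the struct of meters that
// is stored on Sat. Real meters would come from the metrics SDK.
type FunctionMetrics struct {
	executions       int64
	failedExecutions int64
}

// errFunctionFailed is a made-up error used to simulate a failing function.
var errFunctionFailed = errors.New("function failed")

// Handle runs fn, counting every call and, separately, every call that
// returns an error.
func (m *FunctionMetrics) Handle(fn func() error) error {
	atomic.AddInt64(&m.executions, 1)
	err := fn()
	if err != nil {
		atomic.AddInt64(&m.failedExecutions, 1)
	}
	return err
}

func (m *FunctionMetrics) Executions() int64 {
	return atomic.LoadInt64(&m.executions)
}

func (m *FunctionMetrics) FailedExecutions() int64 {
	return atomic.LoadInt64(&m.failedExecutions)
}

func main() {
	m := &FunctionMetrics{}
	_ = m.Handle(func() error { return nil })
	_ = m.Handle(func() error { return errFunctionFailed })
	_ = m.Handle(func() error { return nil })

	// prints "executions=3 failed=1"
	fmt.Printf("executions=%d failed=%d\n", m.Executions(), m.FailedExecutions())
}
```

Because the failure counter is only touched after the execution counter has already been incremented for the same call, it can never exceed it.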
function time
A timer is started before we start function execution, and then observation happens after it. This is a histogram meter with a unit of milliseconds.
There is a `metrics.Timer` struct which upon instantiation (`NewTimer`) saves the current time. Passing the histogram into its `Observe` method then diffs the two times (time at `Observe` minus time at instantiation) and records the result on the histogram meter in milliseconds.
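A minimal sketch of that timer pattern follows. It is not the code from this PR: the `Histogram` interface and `memoryHistogram` type are hypothetical stand-ins for the real histogram meter (whose record call takes more arguments), and only `Timer`, `NewTimer` and `Observe` mirror names mentioned above.

```go
package main

import (
	"fmt"
	"time"
)

// Histogram is a hypothetical, simplified stand-in for the histogram meter.
type Histogram interface {
	Record(milliseconds int64)
}

// memoryHistogram stores observations in a slice so they can be inspected.
type memoryHistogram struct {
	values []int64
}

func (h *memoryHistogram) Record(milliseconds int64) {
	h.values = append(h.values, milliseconds)
}

// Timer remembers the moment it was created.
type Timer struct {
	start time.Time
}

// NewTimer saves the current time.
func NewTimer() Timer {
	return Timer{start: time.Now()}
}

// Observe records the time elapsed since NewTimer, in milliseconds.
func (t Timer) Observe(h Histogram) {
	h.Record(time.Since(t.start).Milliseconds())
}

func main() {
	h := &memoryHistogram{}

	timer := NewTimer()
	time.Sleep(5 * time.Millisecond) // stands in for the function execution
	timer.Observe(h)

	fmt.Println("observations recorded:", len(h.values))
}
```

Creating the timer right before the call and observing right after it keeps the measurement code to two lines at the call site.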
potential refactors for the future
- the grpc connection used for tracing and metrics could potentially be shared
known limitations at the time of writing this message
- ~~if the collector is not running, there is no expo fallback with retries option, sat will refuse to start up~~ not actually the case
- ~~there is no way to configure a noop metrics at the moment, though that's a fairly easy addition~~ is now solved
How to test / enable this?
- check out this branch
- make sure you have the following environment variables set on your host computer (or edit the docker compose file directly)
  - `ATMO_TRACER_HONEYCOMB_API_KEY`: that's your Honeycomb API key that you can see in the Suborbital team settings page
  - `ATMO_TRACER_HONEYCOMB_API_ENDPOINT`: value should be `api.honeycomb.io:443`
  - `ATMO_TRACER_HONEYCOMB_DATASET`: set this to any particular string. I use `gabor-breaks-things`. This needs to be set, otherwise Honeycomb is going to complain (but still retain the data in a bucket called unknown-dataset)
- launch the collector with `docker compose up collector`
- build sat with `make sat`
- run sat with `make runlocal`. This will start sat up on a random port. The logs will tell you which port it is. It will look like this: `{"log_message":"(I) serving on :1268","timestamp":"2022-05-31T18:56:18.048109+01:00","level":3,"app":{"sat_version":"v0.1.4"}}`
- Open up Postman or your favourite tool to send POST requests, and send a bunch of POST requests to localhost:1268 (in my case) with some request body
  - there's a shorthand make target that needs `hey` installed locally: `make bombard PORT=1268`, where 1268 is whatever port sat is running on. That will send 10k requests at sat.
- The file at `./traces/traces.json` should now have entries about function executions and function timings.
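If the collector complains on startup, a quick preflight check of the three environment variables listed above can help. The helper below is a hypothetical convenience and not part of sat or this PR; `requiredEnv` and `missingEnv` are made-up names, and it treats an empty value the same as an unset one.

```go
package main

import (
	"fmt"
	"os"
)

// requiredEnv lists the variables named in the testing steps.
var requiredEnv = []string{
	"ATMO_TRACER_HONEYCOMB_API_KEY",
	"ATMO_TRACER_HONEYCOMB_API_ENDPOINT",
	"ATMO_TRACER_HONEYCOMB_DATASET",
}

// missingEnv returns the required variables that lookup reports as unset
// or empty. The lookup function is injected so it can be swapped out.
func missingEnv(lookup func(string) (string, bool)) []string {
	var missing []string
	for _, name := range requiredEnv {
		if value, ok := lookup(name); !ok || value == "" {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	if missing := missingEnv(os.LookupEnv); len(missing) > 0 {
		fmt.Println("missing environment variables:", missing)
		return
	}
	fmt.Println("all tracer environment variables are set")
}
```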
Hell yeah, this is working nicely!
Extracted metrics and tracing into go-kit, and wired it back in. Only sat-specific code remains in sat, general code (salutes) is now in gokit.