Add timing information for acquiring an existing connection
Currently, client-side provides some timing information when a new connection is created. https://armeria.dev/docs/advanced-structured-logging#client-connection-timing-properties
This information is exposed with the naming connectionAcquisitionDurationNanos.
However, these metrics are recorded only for cases where a new connection is created.
It has been pointed out that the naming is confusing since this metric isn't recorded when an existing connection is used. We may do one of the following:
- Add a new timing to
ClientConnectionTimings
- Add a connection timing info
- e.g.
existingConnectionAcquisitionDurationNanos
- e.g.
- Measure the time to query from the connection pool (https://github.com/line/armeria/blob/b78d9515dc8b4b997dd65787f717443b38a21869/core/src/main/java/com/linecorp/armeria/client/HttpClientDelegate.java#L170-L174))
- Note that whether the acquisition is successful or not, the duration will be recorded.
- Reuse
pendingAcquisitionDurationNanos
- We don't have to worry about naming, but the logic to whether start/end the metric may be complex.
- Don't define a separate timing and just end the timings
- This may be achieved by simply passing the built timings
- https://github.com/line/armeria/blob/b78d9515dc8b4b997dd65787f717443b38a21869/core/src/main/java/com/linecorp/armeria/client/HttpClientDelegate.java#L176
I propose we go with option 1 might be simpest/least prone to error.
I think we also want to expose this metric to RequestMetricSupport by default.
Note: As pointed out by @ikhoon, doing this would also fix a bug where dns timing information wasn't collected if there is a healthy connection already available.
ref: https://line-armeria.slack.com/archives/C1NGPBUH2/p1684469502484689
Thanks @jrhee17 for creating this issue!
I agree that option 1 is the most reasonable option. Here are some of my suggestions on option (and please FFTI):
existingConnectionAcquisitionDurationNanosis a good name. I'd also suggest renamingconnectionAcquisitionDurationNanostonewConnectionAcquisitionDurationNanos. This way, the name is the most clear.- I understand that this^ renaming could be a breaking API change as
ClientConnectionTimingsis probably considered as a public API. So, maybe consider this renaming as part of the next major release. But before that happens, it'd be very nice to update all Javadocs/comments aboutconnectionAcquisitionDurationNanosto clearly point out that this duration is only about acquiring/creating a NEW connection.
I'd also suggest renaming connectionAcquisitionDurationNanos to newConnectionAcquisitionDurationNanos. This way, the name is the most clear.
Rather, I was thinking that existingConnectionAcquisitionDurationNanos would be included in connectionAcquisitionDurationNanos.
- If a readily existing connection exists,
existingConnectionAcquisitionDurationNanos > 0 - If a connection is in the progress of creation,
pendingAcquisitionDurationNanos > 0 - If a connection is created,
socketConnectDurationNanos > 0
Essentially, connectionAcquisitionDurationNanos = existingConnectionAcquisitionDurationNanos + pendingAcquisitionDurationNanos + socketConnectDurationNanos so semantically I was thinking the naming is fine as-is.
Note: @ikhoon also suggested that we change the naming of the metric collected in RequestMetricSupport:
https://github.com/line/armeria/blob/f2116739018c15f230dd8f0e5a4dd5db1fefc888/core/src/main/java/com/linecorp/armeria/internal/common/metric/RequestMetricSupport.java#L289-L296
He suggested that we use tags to differentiate connection acquisition
e.g.
connection.acquisition.duration
acquisition_typenewpiggyback(pendingmight be more appropriate)reuseorexisting
To avoid breaking changes, we may just add another metric at the cost of extra metric exported. (and deprecate the old metric. Deprecation may be propagated via javadocs and documentation to be added)
We should also update documentation since the relationship between values in ConnectionTimings and the metric names may not be clear.
If the default metrics are more accurately expressed/exposed by RequestMetricsSupport, I think it will be possible to reduce the hassle of creating custom metrics by accessing ClientConnectionTimings.
The new metric will provide more accurate information about connection acquisition.