Exception thrown when exporting TimeSeries to Stackdriver every hour
Please answer these questions before submitting a bug report.
What version of OpenCensus are you using?
0.21.0
What JVM are you using (java -version)?
java 10
What did you do?
I am collecting metrics with OpenCensus. The export interval is set to 2 minutes. The implementation is as follows:
static Aggregation DSAppDistribution = Aggregation.Distribution.create(
BucketBoundaries.create(Arrays.asList(0.0, 2.0, 4.0, 6.0, 8.0, 10.0, 20.0, 30.0, 40.0, 50.0,
100.0, 200.0, 300.0, 400.0, 500.0, 1000.0, 1500.0, 2000.0, 2500.0, 3000.0, 3500.0, 4000.0,
4500.0, 5000.0, 10000.0, 15000.0, 20000.0, 25000.0, 30000.0, 40000.0, 50000.0, 60000.0)));
static Aggregation AppDistribution =
Aggregation.Distribution.create(BucketBoundaries.create(Arrays.asList(0.0, 100.0, 200.0,
300.0, 400.0, 500.0, 600.0, 700.0, 800.0, 900.0, 1000.0, 1200.0, 1500.0, 2000.0, 2500.0,
3000.0, 3500.0, 4000.0, 5000.0, 6000.0, 8000.0, 10000.0, 15000.0, 20000.0, 25000.0)));
// Measures
public static final Measure.MeasureLong DS_APP_LATENCY = Measure.MeasureLong
.create("service/downstream_app-latency", "The latency in milliseconds", "ms");
public static final Measure.MeasureLong APP_LATENCY = Measure.MeasureLong
.create("service/app-latency", "The latency in microseconds", "us");
public static final Measure.MeasureLong INBOUND_COUNT = Measure.MeasureLong
.create("service/app-request-count", "Aggregated sum of request count", "1");
// Export interval in seconds
private static final Integer interval = 120;
private void setUpMetricExporter() throws IOException {
registerAllViews();
StackdriverStatsExporter
.createAndRegister(StackdriverStatsConfiguration.builder().setProjectId(projectId)
.setExportInterval(Duration.create(interval.longValue(), 0)).build());
}
private void registerAllViews() {
// Define the count aggregation
Aggregation countAggregation = Aggregation.Count.create();
List<TagKey> tagKeyAppMetricsList =
Collections.unmodifiableList(Arrays.asList(RESPONSE, SERVICE, REVISION));
// Define the views
View[] views = new View[] {
View.create(View.Name.create("service/dsa-response"), "The distribution of latencies for ds apps",
DS_APP_LATENCY, DSAppDistribution, tagKeyAppMetricsList),
View.create(View.Name.create("service/response"), "The distribution of latencies for service",
APP_LATENCY, AppDistribution, tagKeyAppMetricsList),
View.create(View.Name.create("service/app-inbound-traffic"),
"Aggregated sum of request count", INBOUND_COUNT, countAggregation, tagKeyAppMetricsList)
};
// Create the view manager
ViewManager vmgr = Stats.getViewManager();
// Then finally register the views
for (View view : views) {
vmgr.registerView(view);
}
}
public void collectAppMetrics(HttpServerExchange exchange, Long latency) {
Map<TagKey, String> tagKeyStringMap = new HashMap<>();
tagKeyStringMap.put(REVISION, tm);
tagKeyStringMap.put(RESPONSE, Integer.toString(exchange.getStatusCode()));
tagKeyStringMap.put(SERVICE, tm);
recordTaggedStat(tagKeyStringMap, INBOUND_COUNT, 1);
recordTaggedStat(tagKeyStringMap, APP_LATENCY, latency);
collectDownStreamAppMetrics(exchange);
}
private void collectDSAppMetrics(HttpServerExchange exchange, String appName, String artifactId) {
Map<TagKey, String> tagKeyStringMap = new HashMap<>();
tagKeyStringMap.put(REVISION, artifactId);
tagKeyStringMap.put(RESPONSE, Integer.toString(exchange.getStatusCode()));
tagKeyStringMap.put(SERVICE, appName);
recordTaggedStat(tagKeyStringMap, INBOUND_COUNT, 1);
recordTaggedStat(tagKeyStringMap, DS_APP_LATENCY, elapsedTime);
}
}
private synchronized void recordTaggedStat(Map<TagKey, String> tagColumns, Measure.MeasureLong measure,
long value) {
// Build a tag context from the supplied tag key/value pairs.
TagContextBuilder builder = Tags.getTagger().emptyBuilder();
tagColumns.forEach((key, tagValue) -> builder.put(key,
TagValue.create(tagValue), TagMetadata.create(TagTtl.UNLIMITED_PROPAGATION)));
TagContext tctx = builder.build();
// Record the measurement while the tag context is current.
try (Scope ss = Tags.getTagger().withTagContext(tctx)) {
Stats.getStatsRecorder().newMeasureMap().put(measure, value).record();
}
}
What did you expect to see?
No exception or warning when exporting metrics as time series to Stackdriver.
What did you see instead?
2019-07-23T14:25:13.860511104Z Jul 23, 2019 2:25:13 PM io.opencensus.exporter.stats.stackdriver.CreateTimeSeriesExporter export
2019-07-23T14:25:13.860524709Z WARNING: ApiException thrown when exporting TimeSeries.
2019-07-23T14:25:13.860527212Z com.google.api.gax.rpc.UnavailableException: io.grpc.StatusRuntimeException: UNAVAILABLE: HTTP/2 error code: NO_ERROR
2019-07-23T14:25:13.860529383Z Received Goaway
2019-07-23T14:25:13.860535598Z max_age
2019-07-23T14:25:13.860537899Z at com.google.api.gax.rpc.ApiExceptionFactory.createException(ApiExceptionFactory.java:69)
2019-07-23T14:25:13.860540176Z at com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:72)
2019-07-23T14:25:13.860542165Z at com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:60)
2019-07-23T14:25:13.860544374Z at com.google.api.gax.grpc.GrpcExceptionCallable$ExceptionTransformingFuture.onFailure(GrpcExceptionCallable.java:97)
2019-07-23T14:25:13.860547232Z at com.google.api.core.ApiFutures$1.onFailure(ApiFutures.java:68)
2019-07-23T14:25:13.860549235Z at com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1056)
2019-07-23T14:25:13.860551183Z at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30)
2019-07-23T14:25:13.860553166Z at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1138)
2019-07-23T14:25:13.860555328Z at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:958)
2019-07-23T14:25:13.860557426Z at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:748)
2019-07-23T14:25:13.860559778Z at io.grpc.stub.ClientCalls$GrpcFuture.setException(ClientCalls.java:507)
2019-07-23T14:25:13.860562506Z at io.grpc.stub.ClientCalls$UnaryStreamToFuture.onClose(ClientCalls.java:482)
2019-07-23T14:25:13.860564581Z at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
2019-07-23T14:25:13.860566750Z at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
2019-07-23T14:25:13.860568750Z at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
2019-07-23T14:25:13.860570897Z at io.grpc.internal.CensusStatsModule$StatsClientInterceptor$1$1.onClose(CensusStatsModule.java:699)
2019-07-23T14:25:13.860573898Z at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
2019-07-23T14:25:13.860575926Z at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
2019-07-23T14:25:13.860577898Z at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
2019-07-23T14:25:13.860580017Z at io.grpc.internal.CensusTracingModule$TracingClientInterceptor$1$1.onClose(CensusTracingModule.java:397)
2019-07-23T14:25:13.860582127Z at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:459)
2019-07-23T14:25:13.860586009Z at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:63)
2019-07-23T14:25:13.860589843Z at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.close(ClientCallImpl.java:546)
2019-07-23T14:25:13.860592064Z at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.access$600(ClientCallImpl.java:467)
2019-07-23T14:25:13.860594155Z at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:584)
2019-07-23T14:25:13.860596531Z at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
2019-07-23T14:25:13.860598586Z at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
2019-07-23T14:25:13.860600623Z at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:514)
2019-07-23T14:25:13.860603144Z at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
2019-07-23T14:25:13.860605335Z at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
2019-07-23T14:25:13.860607598Z at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1135)
2019-07-23T14:25:13.860609900Z at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
2019-07-23T14:25:13.860611964Z at java.base/java.lang.Thread.run(Thread.java:844)
2019-07-23T14:25:13.860614685Z Suppressed: com.google.api.gax.rpc.AsyncTaskException: Asynchronous task failed
2019-07-23T14:25:13.860616711Z at com.google.api.gax.rpc.ApiExceptions.callAndTranslateApiException(ApiExceptions.java:57)
2019-07-23T14:25:13.860618632Z at com.google.api.gax.rpc.UnaryCallable.call(UnaryCallable.java:112)
2019-07-23T14:25:13.860620879Z at com.google.cloud.monitoring.v3.MetricServiceClient.createTimeSeries(MetricServiceClient.java:1156)
2019-07-23T14:25:13.860642514Z at io.opencensus.exporter.stats.stackdriver.CreateTimeSeriesExporter.export(CreateTimeSeriesExporter.java:85)
2019-07-23T14:25:13.860645060Z at io.opencensus.exporter.stats.stackdriver.CreateMetricDescriptorExporter.export(CreateMetricDescriptorExporter.java:154)
2019-07-23T14:25:13.860647149Z at io.opencensus.exporter.metrics.util.MetricReader.readAndExport(MetricReader.java:167)
2019-07-23T14:25:13.860649314Z at io.opencensus.exporter.metrics.util.IntervalMetricReader$Worker.readAndExport(IntervalMetricReader.java:171)
2019-07-23T14:25:13.860651881Z at io.opencensus.exporter.metrics.util.IntervalMetricReader$Worker.run(IntervalMetricReader.java:164)
2019-07-23T14:25:13.860654324Z ... 1 more
2019-07-23T14:25:13.860656427Z Caused by: io.grpc.StatusRuntimeException: UNAVAILABLE: HTTP/2 error code: NO_ERROR
2019-07-23T14:25:13.860658731Z Received Goaway
2019-07-23T14:25:13.860661012Z max_age
2019-07-23T14:25:13.860663024Z at io.grpc.Status.asRuntimeException(Status.java:532)
2019-07-23T14:25:13.860666044Z ... 22 more
Additional context
This warning, with its suppressed exception, appears about once every hour. The Stackdriver Monitoring charts show no gaps, however, which suggests that metrics do get exported for the other intervals. I would like to know what the impact of this is and how it can be fixed.
This comes from the server side, which closes the connection about every hour (the GOAWAY with max_age in the trace), most likely for load-balancing reasons. After this exception, does the SD exporter reconnect and send data again, or is no more data sent?
@bogdandrutu Yes, after the exception the SD exporter is able to reconnect and send data. My concern is whether the data for the interval in which the exception occurs is lost, that is, neither exported nor processed by the server. If that's the case, is there any retry mechanism that the opencensus-java library offers? We have another application that faces a similar issue, but less frequently.
> If that's the case, is there any retry mechanism that the opencensus-java library offers?
No, our exporter doesn't do an explicit retry. Under the hood, however, the Stackdriver MetricServiceClient has default retry settings:
RetrySettings settings = null;
settings =
RetrySettings.newBuilder()
.setInitialRetryDelay(Duration.ofMillis(100L))
.setRetryDelayMultiplier(1.3)
.setMaxRetryDelay(Duration.ofMillis(60000L))
.setInitialRpcTimeout(Duration.ofMillis(20000L))
.setRpcTimeoutMultiplier(1.0)
.setMaxRpcTimeout(Duration.ofMillis(20000L))
.setTotalTimeout(Duration.ofMillis(600000L))
.build();
definitions.put("default", settings);
RETRY_PARAM_DEFINITIONS = definitions.build();
Thus I think this error can be safely ignored.
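To get a feel for what those numbers mean, here is a rough sketch that works out the nominal delay before each retry from the values in the excerpt above (100 ms initial delay, 1.3 multiplier, 60 s cap, 600 s total timeout). It is plain Java with no gax dependency; the class `RetryScheduleSketch` and the method `nominalDelays` are made up for illustration. It assumes no jitter and ignores the time spent inside each RPC, so the real client's schedule will differ.

```java
import java.util.ArrayList;
import java.util.List;

public class RetryScheduleSketch {
    // Values copied from the RetrySettings excerpt above.
    static final long INITIAL_DELAY_MS = 100L;
    static final double DELAY_MULTIPLIER = 1.3;
    static final long MAX_DELAY_MS = 60_000L;
    static final long TOTAL_TIMEOUT_MS = 600_000L;

    /**
     * Returns the nominal delay (in ms) before each retry. Assumption: no jitter
     * and zero RPC duration, so only the delays count against the total timeout.
     */
    static List<Long> nominalDelays(long initialMs, double multiplier,
                                    long maxDelayMs, long totalTimeoutMs) {
        List<Long> delays = new ArrayList<>();
        double delay = initialMs;
        long elapsed = 0L;
        while (true) {
            long capped = Math.min(Math.round(delay), maxDelayMs);
            if (elapsed + capped > totalTimeoutMs) {
                break; // the next retry would not fit in the total timeout
            }
            delays.add(capped);
            elapsed += capped;
            // Grow the delay geometrically, never beyond the cap.
            delay = Math.min(delay * multiplier, maxDelayMs);
        }
        return delays;
    }

    public static void main(String[] args) {
        List<Long> delays = nominalDelays(
                INITIAL_DELAY_MS, DELAY_MULTIPLIER, MAX_DELAY_MS, TOTAL_TIMEOUT_MS);
        System.out.println("retries that fit in the total timeout: " + delays.size());
        System.out.println("first delays (ms): " + delays.subList(0, 5));
    }
}
```

Note that these settings only describe timing. Whether a particular call is retried at all also depends on which status codes are configured as retryable for that method, which the excerpt does not show.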