Unexpected call to beginMessage() in ByteArrayProtoReader32 when exporting logs from disk
Description
I encountered a java.lang.IllegalStateException: Unexpected call to beginMessage() exception when OpenTelemetry's disk buffering feature attempted to export stored log batches. The issue appears to occur while deserializing log data, specifically within ByteArrayProtoReader32 and the ProtoAdapter.decode() method.
Stack Trace
Fatal Exception: java.lang.IllegalStateException: Unexpected call to beginMessage()
at com.squareup.wire.ByteArrayProtoReader32.beginMessage(ByteArrayProtoReader32.kt:57)
at com.squareup.wire.ProtoReader32AsProtoReader.beginMessage(ProtoReader32AsProtoReader.kt:1)
at io.opentelemetry.diskbuffering.proto.common.v1.AnyValue$ProtoAdapter_AnyValue.decode(AnyValue.java:1)
at io.opentelemetry.diskbuffering.proto.common.v1.AnyValue$ProtoAdapter_AnyValue.decode(AnyValue.java:2)
at io.opentelemetry.diskbuffering.proto.common.v1.KeyValue$ProtoAdapter_KeyValue.decode(KeyValue.java:1)
at io.opentelemetry.diskbuffering.proto.common.v1.KeyValue$ProtoAdapter_KeyValue.decode(KeyValue.java:2)
at io.opentelemetry.diskbuffering.proto.logs.v1.LogRecord$ProtoAdapter_LogRecord.decode(LogRecord.java:1)
at io.opentelemetry.diskbuffering.proto.logs.v1.LogRecord$ProtoAdapter_LogRecord.decode(LogRecord.java:2)
at io.opentelemetry.diskbuffering.proto.logs.v1.ScopeLogs$ProtoAdapter_ScopeLogs.decode(ScopeLogs.java:1)
at io.opentelemetry.diskbuffering.proto.logs.v1.ScopeLogs$ProtoAdapter_ScopeLogs.decode(ScopeLogs.java:2)
at io.opentelemetry.diskbuffering.proto.logs.v1.ResourceLogs$ProtoAdapter_ResourceLogs.decode(ResourceLogs.java:1)
at io.opentelemetry.diskbuffering.proto.logs.v1.ResourceLogs$ProtoAdapter_ResourceLogs.decode(ResourceLogs.java:2)
at io.opentelemetry.diskbuffering.proto.logs.v1.LogsData$ProtoAdapter_LogsData.decode(LogsData.java:1)
at io.opentelemetry.diskbuffering.proto.logs.v1.LogsData$ProtoAdapter_LogsData.decode(LogsData.java:2)
at com.squareup.wire.ProtoAdapter.decode(ProtoAdapter.kt:2)
at com.squareup.wire.ProtoAdapter.decode(ProtoAdapter.kt:16)
at io.opentelemetry.contrib.disk.buffering.internal.serialization.deserializers.LogRecordDataDeserializer.deserialize(LogRecordDataDeserializer.java:6)
at io.opentelemetry.contrib.disk.buffering.internal.exporter.FromDiskExporterImpl.lambda$exportStoredBatch$0(FromDiskExporterImpl.java:43)
at io.opentelemetry.contrib.disk.buffering.internal.storage.files.ReadableFile.readAndProcess(ReadableFile.java:45)
at io.opentelemetry.contrib.disk.buffering.internal.storage.Storage.readAndProcess(Storage.java:2)
at io.opentelemetry.contrib.disk.buffering.internal.storage.Storage.readAndProcess(Storage.java:1)
at io.opentelemetry.contrib.disk.buffering.internal.exporter.FromDiskExporterImpl.exportStoredBatch(FromDiskExporterImpl.java:37)
at io.opentelemetry.contrib.disk.buffering.LogRecordFromDiskExporter.exportStoredBatch(LogRecordFromDiskExporter.java:2)
at io.opentelemetry.android.features.diskbuffering.SignalFromDiskExporter.exportBatchOfLogs(SignalFromDiskExporter.kt:8)
at io.opentelemetry.android.features.diskbuffering.SignalFromDiskExporter.exportBatchOfEach(SignalFromDiskExporter.kt:12)
at io.opentelemetry.android.features.diskbuffering.scheduler.DefaultExportScheduler.onRun(DefaultExportScheduler.kt:9)
at io.opentelemetry.android.internal.services.periodicwork.PeriodicRunnable.run(PeriodicRunnable.kt:6)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641)
at java.lang.Thread.run(Thread.java:919)
Steps to Reproduce:
- Enable disk buffering in OpenTelemetry Android SDK.
val diskBufferingConfig = DiskBufferingConfiguration.builder()
.setEnabled(true)
.build()
- Generate log events and store them on disk.
- Wait for the log export process to trigger.
- Observe the crash when logs are being read and deserialized.
Versions:
opentelemetry-android:0.8.0-alpha- Android OS: 10
Hi @ploszczyca-spoton
This seems to be related to this fix which should be available in the disk-buffering artifact version 1.43.0-alpha.
It should become available in opentelemetry-android in the next release, though you should still be able to use the latest disk-buffering version in your project in the meantime, by adding it explicitly alongside opentelemetry-android, like so:
dependencies {
implementation("io.opentelemetry.android:android-agent:0.9.1-alpha")
implementation("io.opentelemetry.contrib:opentelemetry-disk-buffering:1.43.0-alpha")
}
Which should override the older version provided in opentelemetry-android.
Hi @LikeTheSalad,
Thank you for the response! Great to hear that the issue is already addressed. The crash doesn’t occur often, so I will wait for the next opentelemetry-android release with disk-buffering version bump.
@ploszczyca-spoton Sorry for the delay! We managed to get v0.12.0 released today. Can you try that one out and let us know if it resolves this for you? Thanks thanks.
Note: I closed #997 because this does appear to be fixed. Please let us know. Thanks.
This has been automatically marked as stale because it has been marked as needing author feedback and has not had any activity for 21 days. It will be closed automatically if there is no response from the author within 14 additional days from this comment.
@breedx-splk Were we able to find any reproduction steps on that? We've tried the 0.12.0 as well and we are seeing significant number of similar crashes still. We didn't find the reproduction scenarios, but maybe you have some suggestions on that matter.
Can you try running the latest demo-app from this repo and see if this problem is still happening for you? It's been a bit since I've run that app, but I'm quite sure that I've never seen that same issue in the demo app....and it uses disk buffering.
Can you share anything else that might be unusual about your configuration? I also wonder if there's just old/stale data sitting on disk that needs to be purged. Like, maybe something is trying to read old/corrupt data and when it fails that data stays around to continue causing problems. Maybe?
It still occurs in 0.13.0. Sadly, we have not been able to reproduce it so far.
I analyzed fix https://github.com/open-telemetry/opentelemetry-java-contrib/pull/1571:
@Override
public List<LogRecordData> deserialize(byte[] source) throws DeserializationException {
try {
return ProtoLogsDataMapper.getInstance().fromProto(LogsData.ADAPTER.decode(source));
} catch (IOException e) {
throw new DeserializationException(e);
}
}
Problem there is that issue in LogsData.ADAPTER.decode(source) throws IllegalStateException, not IOException:
Fatal Exception: java.lang.IllegalStateException: Unexpected call to beginMessage()
at com.squareup.wire.ByteArrayProtoReader32.beginMessage(ByteArrayProtoReader32.kt:92)
at com.squareup.wire.ProtoReader32AsProtoReader.beginMessage(ProtoReader32AsProtoReader.kt:31)
at io.opentelemetry.diskbuffering.proto.logs.v1.ResourceLogs$ProtoAdapter_ResourceLogs.decode(ResourceLogs.java:215)
at io.opentelemetry.diskbuffering.proto.logs.v1.ResourceLogs$ProtoAdapter_ResourceLogs.decode(ResourceLogs.java:177)
at io.opentelemetry.diskbuffering.proto.logs.v1.LogsData$ProtoAdapter_LogsData.decode(LogsData.java:155)
at io.opentelemetry.diskbuffering.proto.logs.v1.LogsData$ProtoAdapter_LogsData.decode(LogsData.java:124)
Therefore, the exception is not caught. It is also not mapped to DeserializationException and caught again in FromDiskExporterImpl.java
The previous issue https://github.com/open-telemetry/opentelemetry-android/issues/457 was also related to an IllegalArgumentException, but different one:
java.lang.IllegalArgumentException: java.net.ProtocolException: Unexpected tag 0 at io.opentelemetry.contrib.disk.buffering.internal.serialization.deserializers.SpanDataDeserializer.deserialize(SpanDataDeserializer.java:28)
at io.opentelemetry.contrib.disk.buffering.internal.exporter.FromDiskExporterImpl.lambda$exportStoredBatch$0$io-opentelemetry-contrib-disk-buffering-internal-exporter-FromDiskExporterImpl(FromDiskExporterImpl.java:59) at io.opentelemetry.contrib.disk.buffering.internal.exporter.FromDiskExporterImpl$$ExternalSyntheticLambda0.apply(D8$$SyntheticClass:0)
That one was thrown inside opentelemetry code, not in the ByteArrayProtoReader32 (and is now replaced by DeserializationException)
(I will submit PR to https://github.com/open-telemetry/opentelemetry-java-contrib, but opinion on that would be great ;) )
Hi @JacekPietrasSpotOn, what you mentioned about IllegalArgumentException sounds reasonable 👍
(I will submit PR to https://github.com/open-telemetry/opentelemetry-java-contrib, but opinion on that would be great ;) )
Sounds good, thanks!
That fix in contrib looks good, I approved it and it's in the merge queue, but we just missed the release window, so it'll be out next month. Thanks again 👍🏻
This has been automatically marked as stale because it has been marked as needing author feedback and has not had any activity for 21 days. It will be closed automatically if there is no response from the author within 14 additional days from this comment.