Support for Serialization and De-serialization of Metrics and Logs using Spring Data
Is your feature request related to a problem? Please describe.
The current metric (potentially also log) types of OTel Java Instrumentation do not support serialization / de-serialization and persistence using Spring Data. The persistence fails due to the classes of OTel Java Instrumentation not having been compiled with the -parameters compiler flag and partially because they don't always provide a default constructor.
Persistence of metrics and logs becomes relevant for scenarios where the metrics and logs must not be lost, i.e. some delivery guarantees need to be given. Examples of such scenarios are logs used for audit logging, or metrics used in commercial metering environments.
Please also see this discussion thread I had opened some time ago: https://github.com/open-telemetry/opentelemetry-java/discussions/7554
Describe the solution you'd like
The OpenTelemetry classes for metrics and logs, and potentially also trace spans, should be serializable / de-serializable and support persistence via Spring Data. This would allow application developers who use OpenTelemetry in scenarios where delivery guarantees are required (e.g. audit logging, metering) to temporarily persist OTel data and export it later when a temporary error makes sending the data impossible.
Describe alternatives you've considered
Serializing the classes to XML or JSON via a framework like Jackson and persisting that instead. I still need to try out whether this would be a valid option.
Additional context
More details, also on what would have to change in the OTel Java instrumentation to support this, can be found here.
There is a module for telemetry serialization: https://github.com/open-telemetry/opentelemetry-java-contrib/tree/main/disk-buffering
The SpanData, MetricData, LogRecordData interfaces are only meant to be in-memory representations of those data types. They are intentionally unopinionated about serialization, which is a task normally reserved for SpanExporter, MetricExporter, LogRecordExporter.
However, OpenTelemetry does define a standard serialization format for these data types in the form of OTLP, and our Otlp(Grpc|Http)(Span|Metric|LogRecord)Exporters internally use this marshaling code to perform this serialization. We hand roll the serialization to improve performance and minimize dependencies. Notably, there is no corresponding deserialization code since OTLP exporters don't need to do that.
If you generate bindings from the OTLP proto definitions (or use those from opentelemetry-proto-java), it's much easier to serialize (although less performant and with extra dependencies). All you need to do is perform the mapping from SpanData, MetricData, LogRecordData to the respective proto bindings and use the built-in binary or JSON serialization tools from the proto library dependency. This is what the disk buffering library @zeitlinger links to is doing.
I'm open to making the internal hand-rolled serialization code public, but as mentioned, there is no equivalent hand-rolled deserialization code. So someone would have to do that tedious / careful work, and we'd have to decide where it should live. It's a bit of an odd thing to publish because it's not actually needed by anything in opentelemetry-java.
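To make the encode/decode symmetry concrete, here is a minimal Java sketch of a hand-rolled encoder together with its matching decoder. `SimplePoint` and `SimplePointCodec` are hypothetical names invented for illustration, and the byte layout is an ad hoc one built on `DataOutputStream`. It is not OTLP and not the SDK's internal marshaling code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical, simplified stand-in for a single metric data point.
// It is NOT the SDK's MetricData and carries far fewer fields.
record SimplePoint(String name, long epochNanos, double value) {}

// Hypothetical codec: an ad hoc binary layout, not the OTLP wire format.
final class SimplePointCodec {
  private SimplePointCodec() {}

  // Encoder: the half that an exporter needs.
  static byte[] encode(SimplePoint point) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(bytes)) {
      out.writeUTF(point.name());
      out.writeLong(point.epochNanos());
      out.writeDouble(point.value());
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bytes.toByteArray();
  }

  // Decoder: reads the fields in exactly the order the encoder wrote them.
  static SimplePoint decode(byte[] data) {
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
      return new SimplePoint(in.readUTF(), in.readLong(), in.readDouble());
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

A round trip is then `SimplePointCodec.decode(SimplePointCodec.encode(point))`. The real types are considerably larger (resources, scopes, attributes, several point kinds), which is where the tedious and careful part of writing a decoder comes from.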
Hi @jack-berg, thanks for your reply. I am not sure I totally understood what you meant by mapping "SpanData, MetricData, LogRecordData to the respective proto bindings".
I also want to reiterate that I am not just concerned with serialization but also with deserialization. I would like to be able to temporarily move the in-memory representations of metrics (and likely also logs and spans) to persistent storage and deserialize them from that storage into memory again later. My goal is to store the metric, log and span data for longer, potentially beyond the lifetime of the app, so that I can replay that data to an OTLP backend. This is to make sure no data gets lost, even if the sink is down for longer and the app crashes before the (in-memory) data could be successfully delivered.
My current approach was to use Spring Data to serialize metrics into a DB using a custom MetricReader, and to deserialize them from the DB using a custom MetricExporter. That only works with modifications to MetricData (and SpanData, LogRecordData), though. I could try to serialize using the approach you mentioned, but how would I deserialize this again into an in-memory form ready to be exported as OTLP?
So how would I best achieve that, i.e. how would I best serialize and de-serialize the data again?
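The delivery guarantee described above (persist first, delete only after a confirmed export, replay whatever is left after a restart) can be sketched independently of how the bytes are produced. The following is a minimal Java sketch under stated assumptions: `PendingBatchStore` is a hypothetical class, batches are already serialized to `byte[]`, and the exporter is modeled as a `Predicate<byte[]>` that returns `true` on success. It is not the API of the disk-buffering module.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Stream;

// Hypothetical store-and-forward buffer: one file per serialized batch.
final class PendingBatchStore {
  private final Path dir;
  private long sequence;

  PendingBatchStore(Path dir) {
    try {
      Files.createDirectories(dir);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    this.dir = dir;
  }

  // Persist a batch before any export attempt, so a crash cannot lose it.
  void persist(byte[] serializedBatch) {
    // Timestamp plus counter keeps file names sortable in write order.
    String name = String.format("%020d-%020d", System.currentTimeMillis(), sequence++);
    Path tmp = dir.resolve(name + ".tmp");
    Path target = dir.resolve(name + ".batch");
    try {
      // Write to a temp file first so replay never sees a half-written batch.
      Files.write(tmp, serializedBatch);
      Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Replays pending batches oldest first and returns how many were delivered.
  // A batch is deleted only after the exporter reports success; the first
  // failure stops the replay so the remaining batches stay on disk, in order.
  int replay(Predicate<byte[]> exporter) {
    List<Path> pending;
    try (Stream<Path> files = Files.list(dir)) {
      pending = files
          .filter(p -> p.getFileName().toString().endsWith(".batch"))
          .sorted()
          .toList();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    int delivered = 0;
    for (Path file : pending) {
      try {
        if (!exporter.test(Files.readAllBytes(file))) {
          break;
        }
        Files.delete(file);
        delivered++;
      } catch (IOException e) {
        throw new UncheckedIOException(e);
      }
    }
    return delivered;
  }
}
```

Because a file is removed only after the exporter succeeds, a crash between export and delete leads to a duplicate delivery rather than a loss, so this sketch gives at-least-once semantics.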
This is exactly what we have the library linked above for. Can you try if it works for you?
Excellent! Thanks for confirming. I will give that a try then.