azure-sdk-for-net

[Azure Core] ObjectSerializer for AMQP

Open jsquire opened this issue 3 years ago • 1 comments
Problem statement

To enable interchange of data for our AMQP-based libraries across process boundaries or between languages, it is necessary to transform data into a neutral format that can be understood by all participants involved in the exchange. In many such scenarios, JSON, XML, YAML, or other text-based formats are used. In this case, the underlying data exists in the form of an AmqpAnnotatedMessage and the standard AMQP message format is expected to be understood by participants in the interchange.

The Azure Messaging libraries rely on an external dependency, Microsoft.Azure.Amqp, for network transport and protocol-related functionality. The transport library owns responsibility for serialization/deserialization of data from the AMQP message format but does so internally as part of network operations. There is no public API or type available to allow access to AMQP serialization. Additionally, the transport library was written before there was a large-scale awareness of the performance impact of allocations in .NET, using an approach that can be improved with modern tooling.

Example scenario

In the new Azure Functions isolated process model, the Functions runtime exists in one process and the language worker in another. The runtime hosts the triggers that receive data used for invoking the developer-provided Function application code. The Function application code is executed by the language worker in a separate process. For the application to be invoked, data must be passed across the process boundary from the Functions runtime to the language worker. Likewise, to interact with output bindings, data must be passed by the Function application across the process boundary to the runtime.

Proposed solution

The Azure SDK defines an abstract template for serialization concerns in the form of the Azure.Core ObjectSerializer. Concrete implementations exist for JSON as JsonObjectSerializer and NewtonsoftJsonObjectSerializer.

An AmqpObjectSerializer will be created for the Azure.Core.Amqp package, allowing for translation between AmqpAnnotatedMessage and the standard AMQP message format. The serializer will conform to the AMQP specification, allowing its output to be understood by AMQP-aware libraries across languages.

Scope of work

  • In the Azure.Core.Amqp package, create an AmqpObjectSerializer derived from ObjectSerializer capable of translating between .NET types and the standard AMQP message format.

  • When provided an AmqpAnnotatedMessage for serialization, translate it into the corresponding AMQP message format.

  • When provided another object type for serialization, throw a NotSupportedException.

  • When deserializing with a return type other than AmqpAnnotatedMessage, throw a NotSupportedException.

  • When deserializing, translate the AMQP message format into the corresponding AmqpAnnotatedMessage.
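
The type checks in the list above can be sketched roughly as follows. This is a minimal, language-neutral illustration in Python, not the Azure.Core API: `AnnotatedMessage`, `NotSupportedError`, `AmqpObjectSerializerSketch`, and the injected `encode`/`decode` callables are all hypothetical stand-ins, and the toy codec at the bottom does not produce the real AMQP message format.

```python
import io
from dataclasses import dataclass, field


class NotSupportedError(Exception):
    """Hypothetical stand-in for .NET's NotSupportedException."""


@dataclass
class AnnotatedMessage:
    """Hypothetical, heavily reduced stand-in for AmqpAnnotatedMessage."""
    body: bytes = b""
    application_properties: dict = field(default_factory=dict)


class AmqpObjectSerializerSketch:
    """Models only the accept/reject rules from the scope of work."""

    def __init__(self, encode, decode):
        # encode: AnnotatedMessage -> bytes, decode: bytes -> AnnotatedMessage.
        # Both are placeholders for a real AMQP message-format codec.
        self._encode = encode
        self._decode = decode

    def serialize(self, stream, value, input_type=None):
        declared = input_type or type(value)
        # Any object other than an annotated message is rejected.
        if declared is not AnnotatedMessage or not isinstance(value, AnnotatedMessage):
            raise NotSupportedError(f"cannot serialize {declared.__name__}")
        stream.write(self._encode(value))

    def deserialize(self, stream, return_type):
        # Any requested return type other than the annotated message is rejected.
        if return_type is not AnnotatedMessage:
            raise NotSupportedError(f"cannot deserialize to {return_type.__name__}")
        return self._decode(stream.read())


# Toy codec for demonstration only: it copies the body bytes and is NOT AMQP.
serializer = AmqpObjectSerializerSketch(
    encode=lambda message: message.body,
    decode=lambda data: AnnotatedMessage(body=data),
)

buffer = io.BytesIO()
serializer.serialize(buffer, AnnotatedMessage(body=b"hello"))
buffer.seek(0)
round_tripped = serializer.deserialize(buffer, AnnotatedMessage)
```

Injecting the codec keeps the sketch independent of whichever component ends up owning the AMQP encoding.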

Success Criteria

  • The AmqpObjectSerializer has been created and offers the functionality described in the scope of work.

  • The tests necessary for validation have been created or adjusted and pass reliably.

  • The existing test suite continues to produce deterministic results and pass reliably.

References and Related

jsquire, Sep 16 '22 20:09

While reading through the issue, I stumbled over the following:

"The transport library owns responsibility for serialization/deserialization of data from the AMQP message format but does so internally as part of network operations."

Is this true? My understanding was that messages can be round-tripped completely in memory, as demonstrated by the AmqpMessageTests. There is no InternalsVisibleTo, so these tests use only the public API surface to round-trip an AmqpMessage (unless I have missed something).

Assuming this would work, Azure.Core.Amqp could take a dependency on the Microsoft.Azure.Amqp library, and the serializer could then leverage the same round-trip approach to serialize and deserialize the messages without having to reinvent the wheel. The downside would be that it locks EventHub and ServiceBus into using the same Microsoft.Azure.Amqp library version. An alternative could be to implement a library-specific serializer within each of the EventHub and ServiceBus libraries, which already have a dependency on the AMQP library.

This leaves us with:

"Additionally, the transport library was written before there was a large-scale awareness of the performance impact of allocations in .NET, using an approach that can be improved with modern tooling."

This is a fact, but it is already an issue today with the Microsoft.Azure.Amqp library, which both the EventHubs and ServiceBus .NET SDKs rely on. Is that enough justification to reinvent the wheel for this specific case, and potentially deal with deviations between the serializer in Azure.Core.Amqp and how EventHubs and ServiceBus use the AMQP library at runtime, along with all the potential bugs that might need to be fixed? Or is the idea to pave the way for Azure.Core.Amqp to become the new AMQP library in the future?

I would also argue that keeping Microsoft.Azure.Amqp makes it possible to liaise further and make sure these modern techniques get applied there. Once that is done, EventHubs, ServiceBus, and the serializer automatically benefit from these improvements.

danielmarbach, Sep 17 '22 16:09

Thanks, @danielmarbach. You're absolutely right; we had overlooked the serialization test and our read on the code was incorrect. Using the test as inspiration, I was able to prototype the round-trip serialization with a binary file as intermediate storage using the existing hotfix branch.

I'm going to close this out, since we don't have a more general use case where eliminating the reference to Microsoft.Azure.Amqp is beneficial.

jsquire, Sep 26 '22 19:09