OpenAPI.NET icon indicating copy to clipboard operation
OpenAPI.NET copied to clipboard

Make it possible to have a deterministic order in serialized OpenApi files

Open gturri opened this issue 2 years ago • 1 comments

Is your feature request related to a problem? Please describe. I use OpenAPI.NET to serialize instances of OpenApiDocument and store them in a git repo. So, ideally I would like that when I make a simple change to my API, I get a simple change in the serialized document, in order to have an easy to review git diff.

It is not the case currently, because serialization iterates on some Dictionary and that the order of the iteration is hence not deterministic. For instance, if I add one endpoint to my API, I would like that my serialized OAS file looks like what it was at my previous generation, with just a few added lines. But in practice I may have more diffs, because my paths and my schemas may end up serialized in a different order.

Describe the solution you'd like I can already achieve part of what I would like from my client code in the sense that I can get a deterministic order for the Schemas section, if I do

OpenApiDocument doc = BuildMyDoc();
doc.Components.Schemas = new SortedList<string, OpenApiSchema>(doc.Component.Schemas) // <-- that's the important line
doc.SerializeAsV3(new OpenApiYamlWriter(....));

It works rather fine, in the sense that it leads to a serialized schemas section where all schemas are order alphabetically (making it nice for my git diffs, and making the file easier to browse overall). However, it does not feel really "clean" in itself, and this approach does not work for other parts of the document. In particular for the paths; because they are represented by an OpenApiPaths which ends up extending a Dictionary and I could not find a virtual method or an attribute I could override to achieve a similar deterministic order.

Given this context, a solution that would be nice would be to edit IOpenApiSerializable in order to turn SerializeAsV3(IOpenApiWriter writer) into SerializeAsV3(IOpenApiWriter writer, bool enforceAlphabeticalOrder = false). So for instance OpenApiExtensibleDictionary.SerializeAsV3 could be implemented like

var items = enforceAlphabeticalOrder ? new SortedList<string, T>(this) : this;
foreach(var item in items) {
  ...
}

Having this feature behind a boolean enforceAlphabeticalOrder could make it possible for users interested in this feature to use it, and users not interested in it would not suffer any performance penalty by leaving it to false.

Describe alternatives you've considered As explained above, I tried to achieve this order from my client code, but faced some hard limitations

Additional context If the proposed approach looks good to you, and if it could help, I guess I could try to propose a pull request to implement this.

gturri avatar Aug 17 '23 13:08 gturri

FWIW (in case someone has the same issue and stumble on this feature request), I found a workaround, which is in a nutshell:

  • serialize the OpenApiDocument to a yaml string
  • deserialize this string to a Dictionary with YamlDotNet
  • sort the values of that Dictionary
  • then serialize that Dictionary to get the file I was looking for.

From a code point of view it looks like this:

var oas = ...

// Ensure the schemas are sorted. This can be done directly on the OpenApiDocument
oas.Components.Schemas = new SortedList<string, OpenApiSchema>(oas.Components.Schemas, StringComparer.OrdinalIgnoreCase);

// Now the trick to sort the paths
using var oasWriter = new StringWriter();
oas.SerializeAsV3(new OpenApiYamlWriter(oasWriter));
var yamlDoc = (Dictionary<object, object>) new YamlDotNet.Serialization.Deserializer().Deserialize(new StringReader(oasWriter.ToString()));
yamlDoc["paths"] = ToDictionaryWithSortedKeyStrings((Dictionary<object, object>) yamlDoc["paths"]);

// Now I can serialize back (with YamlDotNet this time)
using var textWriter = new StreamWriter(outPath);
new YamlDotNet.Serialization.Serializer().Serialize(textWriter, yamlDoc);


// A trick here is that we get a IDictionary<object, object> but we know that the keys are string, and we need to
// cast them in order to apply a deterministic order to it (because if I don't use StringComparer.OrdinalIgnoreCase
// I may end up with a different behavior on my laptop and on my CI
private static SortedList<string, object> ToDictionaryWithSortedKeyStrings(IDictionary<object, object> dictionary)
{
    var sorted = new SortedList<string, object>(StringComparer.OrdinalIgnoreCase);
    foreach (var kvp in dictionary)
    {
        sorted.Add(kvp.Key.ToString(), kvp.Value);
    }
    return sorted;
}

That being said, I guess it makes sense to leave this feature request open, because it would be much more convenient if this could be supported directly by OpenAPI.NET

gturri avatar Sep 08 '23 10:09 gturri

Related: https://github.com/dotnet/aspnetcore/issues/59809

mikekistler avatar Feb 28 '25 21:02 mikekistler

@RachitMalik12 Can we slot this in for v2?

I don't think we necessarily have to change anything about the in-memory model to support sorting but the serialization step should attempt do lexogaphic sorting on the path values before seralizing the document.

Similarily, we should probably lexographic sort the elements when in each component type? For example, all schemas get sorted alphabetically by ID and all security schemes as well.

captainsafia avatar Mar 05 '25 19:03 captainsafia

@captainsafia I think I would rather the default behaviour to be to maintain the order provided by the user, but we would be open to providing a Sort() method to reorder the dictionary.

darrelmiller avatar Mar 14 '25 20:03 darrelmiller

Relevant for all dictionary based collections

RachitMalik12 avatar Mar 17 '25 16:03 RachitMalik12

My assumption here was that we'd provide either a comparer or a callback on the settings object so it can be used during the serialization. But @Michael-Wamae 's interpretation was to implement sorting methods here. @darrelmiller can you clarify please?

baywet avatar Apr 08 '25 19:04 baywet

@baywet There are two distinct work items here.

  1. Change the behaviour of dictionaries/lists to maintain the order that items were added.
  2. Add a helper method to allow users to reorder the contents of the dictionaries/lists.

darrelmiller avatar Apr 08 '25 19:04 darrelmiller

To maintain the order we could replace all the Dictionaries we have by OrderedDictionary I thought it was only available for net5+ but it goes all the way back to netstandard2.0, which is good news for us.

As for the helper, I'm assuming this is only relevant during serialization? there are no other scenarios which would require control over ordering from the user? correct?

We have other places where we use lists, they maintain order. We also have hashsets, they don't have an OrderedSet equivalent. Would you leave it as is? or also require to use an ordering collection instead?

baywet avatar Apr 09 '25 16:04 baywet

@baywet OrderedDictionary sounds good. Yes I think the ordering is only necessary during serialization, but why does it matter if we just have a helper method? Are you thinking of doing this a different way, like in the writer settings?

I looked at some places where we have HashSet and I think we may need to change this too. We use it for Tags and considering Tags are used for generating documentation, the order might end up being significant to doc rendering. Are there any downsides to changing from HashSet? Is it just that we lose the uniqueness constraint?

darrelmiller avatar Apr 09 '25 16:04 darrelmiller

@darrelmiller there's something I don't understand here. If we're talking about simply helper methods, and not updating the writer settings, how do you expect people to use them? reorder the DOM and THEN serialize?

As for hashsets, we could simply "implement it" by creating a new HashSet class that effectively stores everything in an OrderedDictionary<T,bool> under the hood, and discard the value all the time. The main downside is the memory footprint and maybe the and increased compute time as well.

baywet avatar Apr 09 '25 17:04 baywet

yes. Most people will create the dom objects and then just serialize in the order they created. However, asp net are generating from code so they want to sort before serializing to minimize diffs if code gets reorganized

darrelmiller avatar Apr 15 '25 12:04 darrelmiller

So they effectively ONLY need a call back during serialization time. It's ok if deserialization-serialization is NOT symmetric in terms of order (assuming no callback is passed). Which means we don't need to change collections. Correct?

baywet avatar Apr 15 '25 12:04 baywet

I can confirm that we only need to order these dictionaries during serialization.

How will this work for arrays, such as the global tags array? We will want to generate this in a deterministic order as well, probably sorted by tag name.

mikekistler avatar Apr 15 '25 14:04 mikekistler

@baywet ASPNET only need it sorted at serialization time. However, we have had requests from other users to have deterministic ordering. I think we should fix the collections so we never have non-deterministic ordering. Having the collections sorted alphabetically is a distinct issue that is the ASP.NET requirement.

darrelmiller avatar Apr 15 '25 15:04 darrelmiller

@mikekistler do you think you'd ever have a requirement to sort collections in a different way? (e.g. tags a specific way, but path items another) As far as I can tell, all OpenAPI model objects have either a "main string property" (e.g. the tag itself), or string key (e.g. media type key, responses key, path items key, etc...). The only annoying exception are schemas. Ignoring that exception for a while, the serialization writer settings could have a property that's a IComparer and this could be used by the library during serialization. What do you think?

If we think this is not enough context to do the sorting, we could instead ONLY replace all collections by ones that maintain order, and let you do the sorting on the object model directly. Doing that would reduce the API surface, also fix the symmetry problem, and give you greater control.

Thoughts?

baywet avatar Apr 15 '25 17:04 baywet

I don't foresee a need for custom sorting options, but if possible I would like to allow a developer to override the sorted collection with an ordered collection. In other words, I think the two options of a) sorted lexicographically by the natural key, and b) a custom ordered dictionary should cover all the bases.

mikekistler avatar Apr 17 '25 18:04 mikekistler

Can you share which specific properties in the open api description you need the capability to be ordered or sorted?

Let's see:

  • Path items ordered by path
  • All properties of components ordered by key
  • Global tags array sorted by tag name
  • ServerObjects in servers ordered by ?? url ??
  • web hooks ordered by name (key)
  • securityRequirements in a security field ordered by name
  • content of requestBody or response object ordered by media-type (key)
  • headers of requestBody or response object ordered by header name (key)
  • links of response object ordered by link name (key)
  • example objects in examples field ordered by example name (key)
  • entries of an encoding field ordered by property name (key)

That's more than I expected, and I might have missed some.

mikekistler avatar Apr 17 '25 18:04 mikekistler

All properties of components ordered by key

Please don't reorder these. They represent something functional that users care about. For example:

public class Employee
{
    public long Id { get; set; }
    public string DisplayName { get; set; }
    public decimal Salary { get; set; }
}

should be emitted as:

- Id
- DisplayName
- Salary

instead of:

- DisplayName
- Id
- Salary

Users expect to find Id first when debugging or reading log files. And that the properties represent what they declared. When ordered alphabetically in OAS, generated clients will send JSON requests with properties in the same alphabetical order. In our server, that results in hitting a sub-optimal code path. Because JsonReader is forward-only, we expect the type property to come first. The remainder of the properties is validated against that. Our suboptimal fallback code reads the request body twice: first to determine the type, and then again to read other properties and validate them against the type.

The remainder of the OAS file is technical, as it does not represent the business entities from an app, so the ordering should not matter much.

bkoelman avatar Apr 28 '25 22:04 bkoelman

Please don't reorder these. They represent something functional that users care about. For example:

This isn't about reorder properties on schemas. It's about ordering in the for top-level items under the components property in the OpenAPI document.

captainsafia avatar Apr 28 '25 22:04 captainsafia

Can the collection interfaces be restored, please? It would enable us to choose between ordered/unordered collections at different places, and pass a custom comparer when specific ordering is needed (specific ordering use case here). Since many collections are nullable now, we need to instantiate them anyway.

We have use cases where we need some nodes sorted alphabetically (culture-insensitive), whereas others must not be sorted to preserve declaration order.

See also the conversation at https://github.com/domaindrivendev/Swashbuckle.AspNetCore/commit/04b52e9828e4410787a1fe3fb8b2d3c402a58877#r156107432.

bkoelman avatar May 09 '25 06:05 bkoelman

It would be great if instead of inheriting from Dictionary<K,V> to inherit from OrderedDictionary<K,V> from an ASP.NET Core openapi perspective. It allows uses to reorder the elements as they wish. For example http methods you probably want to order it as GET, POST, PUT, PATCH. But that should also be up to the user.

When it's an ordered dictionary the user can decide by themselves what the ordering should be by determining the insertion order. By doing this we can use Document, Operation and Schema Transformers (ASP.NET Core) to order it yourself.

I think OrderedDictionary would give an intuitve way for the user to do customize the order within the OpenAPI model, which is something you want to be able to adhere to spectral rules for example.

desjoerd avatar May 26 '25 08:05 desjoerd

@desjoerd We did try that, but we had performance regressions that we didn't want to impose on everyone. By moving back to use of interfaces we give people the option of using something like OrderedDictionary where they want, and we have the option in the future to introduce a reader setting that will use OrderedDictionary by default for those who are ok with the perf tradeoff.

darrelmiller avatar May 27 '25 12:05 darrelmiller