kiota icon indicating copy to clipboard operation
kiota copied to clipboard

[Multiple languages] properties are generated as nullable/optional, despite required/nullable usage in OAS schema

Open bkoelman opened this issue 2 years ago • 53 comments

Kiota ignores the usage of required/nullable in OAS schemas. C# properties are always generated as nullable. This results in unneeded noisy squiggles, as shown here.

NSwag does a better job, by performing the following translation (using the /GenerateNullableReferenceTypes:true switch):

  • OAS non-required + nullable: C# nullable
  • OAS required + nullable: C# nullable
  • OAS non-required + non-nullable: C# non-nullable, unless combined with the /GenerateOptionalPropertiesAsNullable:true switch
  • OAS required + non-nullable: C# non-nullable

Additionally, NSwag adds [Newtonsoft.Json.JsonProperty(Required = ..., , NullValueHandling = ...)] on the generated properties, which results in client-side serialization errors (thus avoiding a server roundtrip on invalid input).

Example component schema
"exampleComponent": {
  "required": [
    "requiredNonNullableReferenceType",
    "requiredNullableReferenceType",
    "requiredNullableValueType",
    "requiredValueType"
  ],
  "type": "object",
  "properties": {
    "nonNullableReferenceType": {
      "type": "string"
    },
    "requiredNonNullableReferenceType": {
      "type": "string"
    },
    "nullableReferenceType": {
      "type": "string",
      "nullable": true
    },
    "requiredNullableReferenceType": {
      "type": "string",
      "nullable": true
    },
    "valueType": {
      "type": "integer",
      "format": "int32"
    },
    "requiredValueType": {
      "type": "integer",
      "format": "int32"
    },
    "nullableValueType": {
      "type": "integer",
      "format": "int32",
      "nullable": true
    },
    "requiredNullableValueType": {
      "type": "integer",
      "format": "int32",
      "nullable": true
    }
  },
  "additionalProperties": false
}

The motivation for this behavior in Kiota is provided here and here. I find it unfortunate that correct server implementations have to suffer from this.

Please upvote this issue if you'd like Kiota to respect the required/nullable usage in your OAS.

bkoelman avatar Dec 14 '23 11:12 bkoelman

Interesting that you should write about this. I'm currently moving away from AutoRest and have been evaluating Kiota and NSwag and this is the issue that is pushing me to NSwag (despite NSwag's lack of updates and catalogue of open issues). I recently watched the .NET video on Kiota and there are many things to like but unfortunately this issue is the blocker.

Basically the consequence of the current behaviour is that I would need to create a second set of models with corrected nullability which makes no sense. Or make copious manual edits to the auto-generated models.

PS: my preference would be for a "strict nullability" option which produces non-nullables for the case of "OAS required + non-nullable". Nullables for the case of "OAS non-required + non-nullable" are fine and even desirable (as null simply means property absent from serialised object).

markm77 avatar Dec 14 '23 11:12 markm77

Thanks @bkoelman for creating this focused issue and bringing everyone to the discussion table. This could represent a breaking change depending on how we design the change, I've put it to the v3 milestone for the time being, but let's keep the conversation going.

I'd like to explore a solution where we wouldn't need an additional switch on the CLI. Not only this keeps things simple to anybody using kiota, this also reduces the number of cases to implement, test for and support.

I like the solution proposed by @markm77 where, if I understood well, and assuming we don't have any switch, we'd only make as non-nullable properties/return types that are required and non-nullable in the description. This is a very specific case to implement, with fewer ramifications, and potentially non-breaking. Any thoughts on that?

Assuming we go down that route, what would you expect during serialization/deserialization? if an object has a non-nullable property, the default value would be serialized when not set, potentially leading to side effects. If a payload doesn't contain the value for a non-nullable property, the default value would be set, potentially also leading to side effects?

baywet avatar Dec 14 '23 18:12 baywet

Hi everyone, We've talked in depth about the nullable aspect of this during our planning meeting. I'm going to capture the notes here to help fuel the discussion.

on adding a switch

We're pushing back against that aspect, not only for the simplicity as I outlined, but also for the maintainability.

Here is a truth table considering the different variables, which already gives us 8 different permutations. Multiply this by the number of languages, we're effectively in the 50+ permutations to implement, test, and maintain, this would represent too much and we believe that's one of the reason why some other generators out there are struggling.

required nullable flag
0 0 0
1 0 0
0 1 0
1 1 0
0 0 1
1 0 1
0 1 1
1 1 1

on the breaking change aspect

While depending on the project setup for a dotnet developer, this change might or might not become a breaking change, it's clearly one for other languages like Java for instance. And we strive to only break people on major revisions, so this would have to be scheduled accordingly.

Here is an example, let's assume the foo property is boolean and non nullable today in the description. Today in Java the model would look like this.

class Model {
  Boolean foo;
}

and would change to that

class Model {
  boolean foo;
}

The change of capitalization of the type means we've effectively switched from a pointer type (true|false|null) to a value type (true|false) and the code depending on that property would need to change to some extent.

We could alleviate some of that by adding backward compatible code and making use of the ecb switch

class Model {
  @Deprecated
  Boolean foo;
  boolean getFoo() {
    if (foo == null) {
      throw;
   }
   return foo.Value;
  }
//...
}

But this gets ugly really quick.

on the serialization writer and the parsenode interfaces

This effectively means that all primitive methods in this interfaces need to be duplicated. Which would tax the implementers.

public interface ParseNode  {
	bool? GetBooleanValue();
        bool GetNonNullBooleanValue(); // this would need to be added
}

instead of doing that we could add static helpers in abstractions to reduce the burden on code generation and implementers

public static ParseNodeNonNullHelpers {
     public static bool GetNonNullBooleanValue(Func<bool?> getter) {
          if (getter() is bool result) return result;
          throw new InvalidOperationException("the value was null when it wasn't expected");
          // this is helpful since it helps with the validation of the incoming and outgoing payloads
    }
}

on microsoft graph descriptions

While investigating this we discovered the Microsoft Graph OpenAPI descriptions are inaccurate around that topic. They are being generated from CSDl by OpenAPI.net.OData, a library our team also owns. I've created two issues over there to help solve this. https://github.com/microsoft/OpenAPI.NET.OData/issues/466 https://github.com/microsoft/OpenAPI.NET.OData/issues/467

We still need to think further about the implications of required. But feedback on all these notes and the previous reply is most appreciated.

baywet avatar Dec 14 '23 22:12 baywet

I like the solution proposed by @markm77 where, if I understood well, and assuming we don't have any switch, we'd only make as non-nullable properties/return types that are required and non-nullable in the description. This is a very specific case to implement, with fewer ramifications, and potentially non-breaking. Any thoughts on that?

Sounds good to me. This matches NSwag with /GenerateNullableReferenceTypes:true /GenerateOptionalPropertiesAsNullable:true, which is what I've seen being recommended in blog posts. I guess the second switch was introduced later because (some) developers didn't like the initial puristic approach.

Assuming we go down that route, what would you expect during serialization/deserialization? if an object has a non-nullable property, the default value would be serialized when not set, potentially leading to side effects. If a payload doesn't contain the value for a non-nullable property, the default value would be set, potentially also leading to side effects?

This is where the Newtonsoft attributes come in. They drive how to (de)serialize, potentially throwing when expectations aren't met (for example, serializing a non-nullable reference-type property that is null). Kiota may need to introduce something similar, being taken into account during (de)serialization. In NSwag, a non-nullable ref/value-type property that contains its default value isn't serialized because of [Newtonsoft.Json.JsonProperty(Required = ..., , NullValueHandling = ...)] usage. Kiota can provide a superior experience when --backing-store is used, but that's an opt-in feature.

bkoelman avatar Dec 14 '23 22:12 bkoelman

Thanks for the detailed replies @baywet .

Assuming we go down that route, what would you expect during serialization/deserialization? if an object has a non-nullable property, the default value would be serialized when not set, potentially leading to side effects. If a payload doesn't contain the value for a non-nullable property, the default value would be set, potentially also leading to side effects?

Considering just the proposed change case of OAS required + non-nullable, my preference would be:

  • [de-serialisation] if a property is not present, a de-serialisation error results due to it being declared "required" to the de-serialiser (a received null would ideally also trigger an error)
  • [serialisation] the case of the property not being set is blocked by it being required in code (e.g. through the constructor enforcing this or by some other means such as the new C# required keyword). In no case should null or a default/unset value be serialised.

markm77 avatar Dec 14 '23 22:12 markm77

Related to this, NSwag goes beyond just scalar properties. I've seen cases where it generates:

public class Customer
{
    [Newtonsoft.Json.JsonProperty( ... )]
    public CustomerAddress Address { get; set; } = new CustomerAddress()
}

bkoelman avatar Dec 14 '23 23:12 bkoelman

Thanks everyone for the input. I wouldn't design anything depending on the backing store. While the backing store is a nice feature in itself, it's orthogonal and optional.

@bkoelman I don't think you're suggesting taking a dependency on newtonsoft annotations, but I want to make it clear to other readers: we wouldn't go that way. Not only because our canonical JSON implementation relies on system.text.json instead, but also because it'd go against one of our key principles of having the generated outcome not depend on any specific serialization format/library.

As for erroring our on serialization/deserialization when the property is null/missing (regardless of whether it's scalar or not), I think we're on the same page, the helper methods design I eluded to in my earlier reply could also be used for serialization.

Thanks everyone for the feedback. Besides further discussions on required, I think we have a narrow enough improvement and a clear design for the nullability aspect.

Don't hesitate to further the discussion on the required aspect.

baywet avatar Dec 15 '23 13:12 baywet

I don't think you're suggesting taking a dependency on newtonsoft annotations

That's true. It's just an implementation detail in NSwag to make Newtonsoft do the right thing. It would be fine if Kiota handled this differently. The point is that the serialization layer becomes aware somehow.

As for erroring our on serialization/deserialization when the property is null/missing (regardless of whether it's scalar or not)

I tried to express that NSwag allocates an empty instance on required/non-nullable non-scalar properties, though I did not explicitly mention that. Doing so is merely a convenience, but if Kiota is going to support that as well, it may impact how the backing store works (it still needs to track the assignment).

bkoelman avatar Dec 17 '23 13:12 bkoelman

The primary problem is that the use of required/nullable in OpenAPI is broken. We have discussed this extensively in the OpenAPI meetings and with the JSON Schema maintainers. Most people want to describe a single "type schema" for their CRUD operations, but OpenAPI does not provide the facility to vary the constraints of a JSON Schema depending if you are creating, reading or updating a resource.

What most people want the majority of the time is x-RequiredForCreate on a property. It is rare for a property to be required when doing a PATCH. It gets tricky on upserts because there is no way a client can validate whether a property is required because it doesn't know if the PUT/PATCH is going to do a create or an update. Also, to be pedantic, we don't actually ever know when a request is doing a create until you get a 201 back. The POST method doesn't guarantee anything is being created. Are there other scenarios where "required" is useful other than during creation?

Unfortunately, we are all trying to make the best of a bad situation. NSwag's approach to providing a bunch of switches so the user can choose what they want is one way to do this. As @baywet has said, we are trying really hard to avoid taking the route of adding 101 switches in order to maintain simplicity and reduce the potential for bugs.

Respecting non-nullable for required fields will work for some APIs, but it becomes quite problematic for APIs that allow projections on GET requests. If I do GET /patients?fields=firstName,LastName and the patient type has a non-nullable & required dateOfBirth field then we are going to run into problems on the client in strongly typed languages.

This problem does need to be fixed. But it needs to be fixed in OpenAPI/JSON Schema land before we can address it properly in generated clients. This is primary reason why we have taken a very loose approach to dealing with payload data constraints up to this point.

darrelmiller avatar Jan 02 '24 21:01 darrelmiller

The primary problem is that the use of required/nullable in OpenAPI is broken.

I disagree. It is well-defined, just not so convenient for simplistic unstructured APIs, like the default Web API template in Visual Studio, where models are exchanged directly without adhering to any well-defined protocol. Developers often start out that way, only later realizing all the benefits of a structured approach such as JSON:API that takes away the guesswork from consumers. Using JSON:API (and probably GraphQL too) doesn't have any of the issues you mentioned. The problem that needs to be fixed is in Kiota (which doesn't respect the schema), not in OpenAPI itself. I'm not interested in your plans to incorporate into OpenAPI what other standards already provide. Professional APIs like Stripe and GitHub have overcome these concerns long ago, it's just unexperienced API developers running into these.

And yes, once a structure is involved, there are cases where required non-nullable occurs in PATCH request bodies, see https://jsonapi.org/format/#crud-updating. But also at many other places, like error objects being returned, identifier objects that must consist of type/id etc. The model properties you're referring to are just a small part of the payload.

Most people want to describe a single "type schema" for their CRUD operations

I'm not convinced of that. Do you have any evidence? We deliberately chose not to, which solves the post/patch and ?fields issues.

Today the nullability/required handling is superior in NSwag and Kiota feels clumsy in this regard. We don't need any switches in Kiota, just implement how NSwag works when both switches are on (which is what everybody uses in practice).

bkoelman avatar Jan 02 '24 23:01 bkoelman

Thanks @darrelmiller for your input.

Are there other scenarios where "required" is useful other than during creation?

Please remember we are talking about client-side models where personally the primary thing I want to "validate" is server responses (this is far more important to me than requests because I completely control those - although validation is useful there too).

I want to "fail fast" (ideally during deserialisation) if a server does not return a required non-nullable field in response to e.g. a POST or GET. I also want to have a non-nullable in my client-side model for such a field as the rest of my program operates on the assumption that the required non-nullable field has been supplied.

It strikes me also that the issues you mention can potentially be resolved by editing the OpenAPI document (e.g. creating an extra "type schema"). Before using an Open API client generator I usually need to do this to some extent anyway. But the problem here is incorrect nullability in the models produced which is basically impossible to fix without extensive work or duplicate models.

markm77 avatar Jan 03 '24 18:01 markm77

Most people want to describe a single "type schema" for their CRUD operations

Are you sure about that? Same as @bkoelman, we also deliberatelly chose to use different models. Even if they are 100% the same. We had one to many bugs, where a shared model got altered and broke another endpoint.

saithis avatar Feb 16 '24 19:02 saithis

Any updates on this issue?

yasaichi avatar May 01 '24 05:05 yasaichi

After evaluating a number of alternatives, I chose Kiota for our client APIs and have been generally happy with it (thank you for your work on it!). In my experience, making all properties nullable is by far the biggest rough edge I've experienced with it. I've had to add many checks/assertions/branches that wouldn't be necessary, including having to check the OAS schema 10s to 100s of times to ensure that any given property is marked as required.

I would be very happy with generated C# that matches the required/nullable values in the schema. I'd also be happy with adding a switch/line in the lock file to enable it, if backcompat becomes a blocker.

johncrim avatar May 03 '24 04:05 johncrim

Any updates on this? Having all model-properties be nullable makes my client code full of null-checks that I only need to satisfy the compiler.

haefele avatar May 17 '24 08:05 haefele