drf-spectacular
drf-spectacular copied to clipboard
Q: Split request generates response with optional fields, intended?
tl;dr: Serializer fields are labeled as optional in the response if they satisfy any of the following:
- nullable
- has a default
- is not readonly
I propose adding a setting that assumes response fields are required unless explicitly decorated with extend schema.
More of a question since this behavior doesn't seem correct.
When COMPONENT_SPLIT_REQUEST=True
, fields for the request component are correct. Fields with defaults or required=False
are set as optional. However, this same behavior seems to apply to the response components.
Here's an example:
from uuid import uuid4
from django.db import models
from rest_framework import serializers
class Address(models.Model):
uuid = models.UUIDField(default=uuid4, primary_key=True)
address1 = models.CharField(max_length=255)
address2 = models.CharField(max_length=255, null=True, blank=True)
class AddressSerializer(serializers.ModelSerializer):
class Meta:
model = Address
fields = ["uuid", "address1", "address2"]
Which generates a response schema that looks something like this:
Address:
type: object
properties:
uuid:
type: string
format: uuid
address1:
type: string
maxLength: 255
address2:
type: string
maxLength: 255
nullable: true
required:
- address1
Meaning the only field that is guaranteed in the response is address1
, since it neither has a default nor is nullable.
Marking the fields as readonly fixes the issue, but of course means we can't update those fields:
class AddressSerializer(serializers.ModelSerializer):
class Meta:
model = Address
fields = ["uuid", "address1", "address2"]
read_only_fields = fields
And yes, uuid
should have been marked readonly from the start, but I wanted to show that it is marked optional because it has a default.
To get the behavior I was expecting I created a postprocessing hook which makes all response fields required:
def all_response_fields_required(result: dict, **kwargs):
for component, data in result["components"]["schemas"].items():
component: str
data: dict
# Only interested in response schemas. If a model name ends with "Request"
# that will need to be handled here
if component.endswith("Request"):
continue
if "properties" not in data:
continue
data["required"] = list(data["properties"].keys())
return result
Disclaimer: If you decide to use this hook in your own project please test it, it's a bit on the hacky side and almost certainly has a bug or two.
This ultimately boils down to a question of whether this behavior is intended or not. I don't see a way to declare a field as non-required for requests but required for responses without making a separate serializer or going the postprocessing route.
In our API we don't have a concept of an optional field though I understand that some APIs do. Because DRF doesn't distinguish between required for requests vs responses this seems like it might need to be an @extend_schema
kind of decorator, whereby fields are assumed to be required in the response unless explicitly labeled as optional.
For backwards compatibility, a setting could be added to control what the default behavior is. ASSUME_RESPONSE_FIELDS_REQUIRED
which defaults to False
, or something along those lines. Setting it to True
makes them always required unless decorated.
tl;dr: This is a finely tuned trade-off between the sometimes under-specified DRF and the usefullness of potentially generarated client code. A lot of thought went into this. Not saying there is no flaws/mistakes, but we have to look very carefully at the impact of required
.
We have to unpack this into multiple cases:
-
COMPONENT_SPLIT_REQUEST=False
(default) request/response: Since the serializer is represented as one component it needs to be a tradeoff. As you noted, there is only onerequired
list and you cannot distinguish between request and response here. -
COMPONENT_SPLIT_REQUEST=True
: This option is basically the attempt to get closer to the actual behavior in the presence of this duality. However, there is still only one serializer to draw info from.
Let's assume for now that we are talking about COMPONENT_SPLIT_REQUEST=True
for ModelSerializer
. (The field default for a regular Serializer
is in fact required=True
)
And yes, uuid should have been marked readonly from the start, but I wanted to show that it is marked optional because it has a default.
@pytest.mark.django_db
def test_modelserializer():
s = AddressSerializer(data={
"uuid": str(uuid.uuid4()),
"address1": "foo"
})
s.is_valid(raise_exception=True) # WILL WORK
s.save()
Nope, that is not correct. models.UUIDField(default=uuid4, primary_key=True)
does not make it read-only. You would need to add editable=False
there. PK does not make a field read-only. Your API will accept a uuid
if specified like that. I would call that security issue.
[...] Because DRF doesn't distinguish between required for requests vs responses this seems like it might need to be an @extend_schema kind of decorator, whereby fields are assumed to be required in the response unless explicitly labeled as optional.
Just to provide additional context, DRF will generally not force you to return all the fields in the serializer response. The "validation" is only running on the request.
Furthermore, we generally draw from the field's required
flag that is provided/generated by DRF. Since there is only one serializer for both cases, there is no directionality to this information.
As a matter of fact we have a setting to turn off required
for read-only as it creates problems with client generators (COMPONENT_NO_READ_ONLY_REQUIRED
). In a sense, you are asking for the opposite behavior to that.
But I'll admit that reusing the required list for response fields is somewhat inconsistent. We just use the info that we do have instead of going all-or-nothing.
You could make the argument that for a ModelSerializer
on responses (and only there), the fields are always present. I don't think we make that distinction yet. But I am also not 100% sure if that always holds true. Some more investigation would be needed.
@tfranzel Thanks for the thoughtful response.
Just to clarify, when I said And yes, uuid should have been marked readonly from the start, but I wanted to show that it is marked optional because it has a default.
, I meant that it should have been marked as readonly in the serializer (or editable=False
) because it's a PK (as you rightly mentioned, not making it readonly is a security issue).
Using a PK field in the example was a poor choice on my part because it distracted from what I was trying to point out, which is that the schema marked it as optional since it had a default.
You could make the argument that for a ModelSerializer on responses (and only there), the fields are always present. I don't think we make that distinction yet. But I am also not 100% sure if that always holds true. Some more investigation would be needed.
I think any kind of assumptions we make regarding what may or may not be serialized will cause problems for someone. If anything, the current behavior is the best approach. It's better to assume fields are optional rather than assume they are required, because as soon as one of those required fields is missing, something is going to break.
Having a standardized way of defining this with spectacular would be spectacular but I think there are decent enough workarounds. The postprocessing hook is a convenient brute force solution, you could make response serializers that mark everything as readonly which guarantees they are included, or maybe do some tweaking of the schema in your client. For example, if you are using the schema to generate interfaces in Typescript, you could wrap the responses with Required<>
.
@Kangaroux thanks a lot for this all_response_fields_required