piccolo icon indicating copy to clipboard operation
piccolo copied to clipboard

`create_pydantic_model` returns models with wrong fields marked as `null`

Open HakierGrzonzo opened this issue 1 year ago • 5 comments

When creating pydantic models with create_pydantic_model function, columns marked as null are not marked as null in the data. Instead the function uses the required property.

How to recreate

Consider the following application:

from piccolo.table import Table
from piccolo.utils.pydantic import create_pydantic_model
from piccolo import columns

# We need something that consumes the Pydantic models
from fastapi import FastAPI

import json

class SomeThing(Table):
    name = columns.Text(null=False)
    pet_name = columns.Text(null=True)
    age = columns.Integer(required=True, null=True)

some_model = create_pydantic_model(SomeThing)

app = FastAPI()

@app.get('/')
def foo() -> some_model:
    pass

# We print the OpenAPI schema generated for table `SomeThing`
print(json.dumps(app.openapi()['components']['schemas']['SomeThing']))

The output returned is:

{
  "properties": {
    "name": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "title": "Name",
      "extra": {
        "nullable": false,
        "secret": false,
        "unique": false,
        "widget": "text-area"
      }
    },
    "pet_name": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "title": "Pet Name",
      "extra": {
        "nullable": true,
        "secret": false,
        "unique": false,
        "widget": "text-area"
      }
    },
    "age": {
      "type": "integer",
      "title": "Age",
      "extra": {
        "nullable": true,
        "secret": false,
        "unique": false
      }
    }
  },
  "type": "object",
  "required": [
    "age"
  ],
  "title": "SomeThing",
  "extra": {}
}

As we can see, the fields name and pet_name are marked as str | None, despite having different null options set.

In contrast, the age field is marked as an int, which may never be None, despite it being possibly null in the database.

What causes this behavior:

https://github.com/piccolo-orm/piccolo/blob/7373a8490c84e3c1577d987a34dbc376474c2c30/piccolo/utils/pydantic.py#L246

In the above line, the type for the field is set as T | None based on the required property, instead of the null property.

This generates plausible models for writing to the database, but not for reading data from the database (we get rogue nullable properties).

Why does this behavior need to be changed:

Let's consider a simple change to the example above:

from uuid import uuid4
from piccolo.table import Table
from piccolo.utils.pydantic import create_pydantic_model
from piccolo import columns

from fastapi import FastAPI

import json

class SomeThing(Table):
    id = columns.UUID(default=uuid4(), null=False)

some_model = create_pydantic_model(SomeThing)

app = FastAPI()

@app.get('/')
def foo() -> some_model:
    pass

print(json.dumps(app.openapi()['components']['schemas']['SomeThing']))

We have an autogenerated uuid column, and we would like to create a model that describes a row in that table. We cannot mark id as required as it might be generated by the database.

The generated OpenAPI schema is:

{
  "properties": {
    "id": {
      "anyOf": [
        {
          "type": "string",
          "format": "uuid"
        },
        {
          "type": "null"
        }
      ],
      "title": "Id",
      "extra": {
        "nullable": false,
        "secret": false,
        "unique": false
      }
    }
  },
  "type": "object",
  "title": "SomeThing",
  "extra": {}
}

This causes various OpenAPI client generators to generate the following type:

interface SomeThing {
  id: string | null
}

This makes it seem that the id might be null, when it will never be null.

How to fix this bug:

For my purposes, I just run this patch:

https://github.com/piccolo-orm/piccolo/commit/64629ca372ea92dad9321106c4579ff3845f9673

This causes piccolo to generate the correct models for read operations.

For write purposes, I just use all_optional.

HakierGrzonzo avatar Dec 13 '24 11:12 HakierGrzonzo

We have an autogenerated uuid column, and we would like to create a model that describes a row in that table. We cannot mark id as required as it might be generated by the database.

@HakierGrzonzo I just have one question? Why adding a required argument is problematic, can you please give some example? An required argument is for writing to the database, and that will ensure that we do not have null values ​​in the database when selecting data. Unless you have an existing database, which rows already have null values. In that case, we can add null=True in the Piccolo table definition. I'm sorry if I don't understand well and miss your point.

sinisaos avatar Dec 15 '24 14:12 sinisaos

I just have one question? Why adding a required argument is problematic, can you please give some example? An required argument is for writing to the database, and that will ensure that we do not have null values ​​in the database when selecting data.

@sinisaos Using required for specifying possibly null values in the schema fails in some ways:

  1. When a developer specifies null=False, required=False we get a lot of T | null values downstream, despite the database guaranteeing that the data will never be null (and any INSERT containing null will fail). This is a skill issue that happens very often (sample size=8, skill issue rate of 100%).
  2. When using PostgreSQL insert/update triggers, it is possible to add additional values to the INSERT query. This makes it so the field is NOT NULL in the database AND it is required=False in the piccolo schema.

Reason no.1 is common, annoying, but it is work-around-able.

Reason no.2 stems from piccolo not really working with database-driven defaults (see #919).

In conclusion, when generating the pydantic models, piccolo should treat the database as a source of truth for the can this field be null. The current required behavior when e.g. INSERTing data is ok, but when generating the models they should follow the database schema.

HakierGrzonzo avatar Dec 15 '24 15:12 HakierGrzonzo

@HakierGrzonzo Thank you for your time and detailed explanation. You are right that if we use required=False we get T | null, but I assumed if we want a required field that is NOT NULLwe should use required=True and null=False like this.

def generate_uuid():
  return str(uuid.uuid4())

class SomeThing(Table, db=DB):
  uuid = columns.UUID(required=True, default=generate_uuid, null=False)
  # uuid = columns.UUID(required=True, default=generate_uuid) # null is False by default

I'm not sure if it's a good idea, but maybe we can force required to be True if null is False, to prevent user errors by changing this line

https://github.com/piccolo-orm/piccolo/blob/7373a8490c84e3c1577d987a34dbc376474c2c30/piccolo/columns/base.py#L504

to this

required=True if not null else required,

Thanks again.

sinisaos avatar Dec 15 '24 16:12 sinisaos

I'm not sure if it's a good idea, but maybe we can force required to be True if null is False, to prevent user errors by changing this line

This would fix the mentioned issue, thanks for responding.

HakierGrzonzo avatar Dec 19 '24 20:12 HakierGrzonzo

Sorry for the slow reply on this.

I agree that the current behaviour is unintuitive. I remember people mentioning it in the past.

I've opened a PR, where null determines if it's optional or not:

class Band(Table):
    name = Varchar(null=True)

>>> create_pydantic_model(Band).model_fields['name'].annotation
Optional[str]

class Band(Table):
    name = Varchar(null=False)

>>> create_pydantic_model(Band).model_fields['name'].annotation
str

But required still works - it overrides null. The database column might be nullable, but if you still want the user to provide a value you can do this:

class Band(Table):
    name = Varchar(null=True, required=True)

>>> create_pydantic_model(Band).model_fields['name'].annotation
str

If you use all_optional it overrides everything, and the field is always Optional.

What do you both think?

It's potentially a breaking change, but because the behaviour of required is staying the same (i.e. it still takes precedent over null), hopefully it should be OK for most users.

dantownsend avatar Jan 14 '25 18:01 dantownsend