Flask-RESTX Models Re-Design
For quite some time there have been significant issues around data models, request parsing and response marshalling in flask-restx (carried over from flask-restplus). The most obvious of these is the deprecation warning about the reqparse module that has sat in the documentation for far too long. These changes have been put off for various reasons which I won't discuss here; however, now that the new fork is steadily underway, I (and no doubt others) would like to start addressing this.
Since this digs quite deep into the architecture of flask-restx, there will be significant (and likely breaking) changes required. As such, this issue is to serve as a discussion around the API we would like to provide and some initial ideas of how best to proceed. This is not intended to be the starting point of hacking something together which makes things worse!
I will set out my current thoughts on the topic; please contribute by adding more points and expanding on mine with more discussion.
High Level Goals:
- Uniform API for request parsing and response marshalling
  - e.g. remove the separation between reqparse and models
- Generate correct and valid Swagger/OpenAPI Specifications
- Validation of input and output data should conform to the generated Swagger/OpenAPI Specifications
  - e.g. if the Swagger/OpenAPI spec considers a value valid, the model should too
- Define models using JSON Schema
  - Supported already, but with numerous issues (@j5awry has been battling this for some time)
- OpenAPI 3 support
General Issues/Discussion Points
- What should the API look like?
  - Continue with the api.marshal, api.doc decorator style?
- How to define models?
  - Do we force direct usage of another library e.g. Marshmallow, or wrap in some other API and use the library for the "under the hood" work?
- Model validation
  - External libraries e.g. Marshmallow
- Schema Generation
  - External libraries e.g. Marshmallow
- Backwards compatibility
  - Continue to support reqparse and the existing models interface?
- Swagger 2.0 vs OpenAPI 3.0
  - IMO generating both should be a goal if possible
Resources/Notable Libraries
- https://marshmallow.readthedocs.io/en/stable/
- https://github.com/fuhrysteve/marshmallow-jsonschema
- https://pydantic-docs.helpmanual.io/
- Faust Models, Serialization and Codecs https://faust.readthedocs.io/en/latest/userguide/models.html
- Faust is not a Flask or even REST library, but I have found its Models to be a nice interface to use.
- https://github.com/apryor6/flask_accepts
- Swagger 2.0 https://github.com/OAI/OpenAPI-Specification/blob/master/versions/2.0.md
- OpenAPI 3.0 https://github.com/OAI/OpenAPI-Specification/blob/master/versions/3.0.2.md
I'm curious about the "High Level Goals". With the exception of defining models using JSON Schema, fastapi https://fastapi.tiangolo.com was created to meet the high level goals. Is RESTX aiming to become a synchronous version of fastapi?
@vdwees There are many frameworks working toward those high level goals. marshmallow has a setup built on Flask already. There are similar frameworks out there on top of Django, Bottle, Hug, etc (actually, Hug might be compliant out of the box... it's been a little bit since I used it). Some more (ha) examples include marshmallow-code/flask-smorest, connexion, pyramid w/ pyramid_openapi3 plugin, etc.
So, no, I don't think we're a synchronous version of fastapi. If anything, we're closest to flask-smorest, especially if we decide to use marshmallow for generating models. That's part of what makes the Python community so great and frustrating all at once. See a problem? Make a framework or library, and put it out there! Did someone else do something similar? Probably! But that's ok, cause maybe yours does it a little bit differently!
On topic now, we should probably split up doing some research on different modeling libraries, coming up with good pros/cons of each, and then coming to a decision. Some things we should consider:
- does it already generate openapi specs (by itself or with an additional library)
- does it already generate jsonschema (by itself or with an additional library)
- speed!
- maintainability -- is it well maintained? Can we trust it'll be there in a year? five years?
- open nature (can I say i'm partial to the Marshmallow folks cause someone came knocking? Yeah, I can)
We should also consider the "write it ourselves" approach, taking into account the time it'd take to do well, and if we can personally maintain it to the level of other libraries.
Agreed with @j5awry on the fastapi point, these are two different projects and I'm fairly certain flask-restplus existed before fastapi so if anything it's the other way round 😉
I agree with your dissection there @j5awry, I'm hoping to take a first step in looking at 5. and/or the "write it ourselves" approach this weekend (I was planning to last week but got dragged into the failing CI issues instead 🤷♀️)
Like I said already, I trust you folks on the swagger/openapi part. I believe you know about this subject way more than me.
Anyway, in my view, we will soon be facing a dilemma:
- on one hand, we want to move forward and stop relying on the soon to be deprecated reqparse
- on the other hand, a huge part of the community is still using restplus and we will need to remain "compatible" for some time
So I'd suggest we release v1.0.0 with the removal of python <= 3.5 support and a clear deprecation notice for the reqparse usage, but we should keep maintaining a 1.x branch for let's say 1 year where we would ship bugfix and compatible improvements.
And in parallel, we can go ahead and release a v2.0.0 where we simply drop support for reqparse and we start working on an alternative.
My personal opinion is that we should use an external package like Marshmallow for model validation and schema generation. The community around Marshmallow is really good and you can see day-to-day activity on that project.
I have followed flask-restx since it was forked from the original flask-restplus. You also have a really good community, but I think there are not enough people to cover creating a "new marshmallow"; integration with the existing Marshmallow should be enough for now.
I'm open to discussion and I also have a little bit of free time, so I can help you with the integration if you choose to go that way.
I've been experimenting a little bit with what API we could provide. This is a modified version of the todo_blueprint.py example aiming to explore a possible user-facing API for the models/validation with Marshmallow (not tied to Marshmallow, just where I've decided to start).
The idea is to keep the existing @api.marshal (for responses) and @api.expect (for requests) API but use Marshmallow Schemas. Behind the scenes, the validation, marshalling and OpenAPI spec generation will take place.
Disclaimer: This is not meant to be a good example of implementing a REST API, just exploring the model definition/marshalling/request parsing API.
Any thoughts on the API? (not so much the underlying implementation)
```python
from uuid import uuid4

from flask import Blueprint, Flask
from marshmallow import Schema, fields

from flask_restx import Api, Resource

api_v1 = Blueprint("api", __name__, url_prefix="/api/1")
api = Api(api_v1, version="1.0", title="Todo API", description="A simple TODO API")
ns = api.namespace("todos", description="TODO operations")


# Class-style schema
class Task(Schema):
    description = fields.String(required=True, description="The task details")


class Todo(Schema):
    id = fields.String(required=True, description="The todo ID")
    task = fields.Nested(Task, required=True)
    completed = fields.Boolean(required=True)
    created_at = fields.DateTime(required=True, format="iso")
    completed_at = fields.DateTime(format="iso")


# Dictionary-style schema (unchanged from user API perspective)
Todo = api.model(
    "Todo",
    {
        "id": fields.String(required=True, description="The todo ID"),
        "task": fields.Nested(Task, required=True),
        "completed": fields.Boolean(required=True),
        "created_at": fields.DateTime(required=True, format="iso"),
        "completed_at": fields.DateTime(format="iso"),
    },
)


# Dummy definitions for demonstration purposes
def db_get_todo_or_abort(todo_id: str) -> dict:
    """Fetch Todo from DB by id, abort with `404` if not found."""
    pass


@ns.route("/<string:todo_id>")
@api.doc(responses={404: "Todo not found"}, params={"todo_id": "The Todo ID"})
class TodoItem(Resource):
    """Show a single todo item and lets you delete them"""

    @api.marshal_with(Todo)
    def get(self, todo_id):
        """Fetch a given resource"""
        # Dict returned here is passed through the Marshmallow schema for
        # validation and marshalling to JSON in the response
        data = db_get_todo_or_abort(todo_id)
        return data

    @api.expect(Task, location="form")
    # Could also define inline as a Dict
    # @api.expect(
    #     {"task": fields.String(...)}, location="form",
    # )
    @api.marshal_with(Todo)
    def put(self, todo_id, task):
        """Update the task of a Todo"""
        # args automatically validated and passed in as an instance of Task -
        # no parser.parse_args
        data = db_update_todo(todo_id, task)  # Use the Marshmallow schema directly
        # Todo Marshmallow schema returned is automatically marshalled into
        # JSON in the response
        return Todo(**data)


@ns.route("/")
class TodoList(Resource):
    """Shows a list of all todos, and lets you POST to add new tasks"""

    # Define some query params for filtering
    @api.expect(
        {"completed": fields.Boolean(), "completed_at": fields.DateTime(format="iso")},
        location="query",
    )
    # Marshal a list of a given schema
    @api.marshal_list_with(Todo)
    def get(self, args):
        """List all todos"""
        # Imagine this uses the filters in args properly to filter the result
        # set from the DB...
        return db_get_all("todo", filters=args)

    # Validate args with the Task schema then dump to a dict before passing
    # into the handler
    @api.expect(Task, location="json", as_dict=True)
    @api.marshal_with(Todo, code=201)
    def post(self, args):
        """Create a todo"""
        # Just an example, I don't want to start a flame war around which
        # values should be used for IDs in a DB!
        todo_id = uuid4()
        # Do some work with args
        task = sanitise_input(args["task"])
        data = db_create_todo(str(todo_id), task)
        return data, 201


if __name__ == "__main__":
    app = Flask(__name__)
    app.register_blueprint(api_v1)
    app.run(debug=True)
```
Any comments on that initial API example? I've got some ideas about the implementation; however, it's going to be a significant amount of work to do properly, so I don't want to get started unless we're generally happy with the user-facing API 😊
The core Namespace/Api/Swagger classes are tightly coupled to the Model implementation at the moment, so I imagine it's not going to be a simple process 😅 Current extensions to integrate Marshmallow (e.g. flask-accepts) generally convert the Marshmallow schema to flask-restx models/reqparse objects, whereas this work will be largely replacing models/reqparse so that we can fix a whole host of issues! 😁
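For reference, this is roughly the shape of the flask-accepts conversion approach mentioned above (a minimal sketch from memory of its README, so treat the exact decorator arguments as an approximation; the TodoSchema here is just an example):

```python
from flask import Flask, request
from flask_accepts import accepts, responds
from flask_restx import Api, Resource
from marshmallow import Schema, fields

app = Flask(__name__)
api = Api(app)


class TodoSchema(Schema):
    id = fields.String(dump_only=True)
    task = fields.String(required=True)


@api.route("/todos")
class TodoResource(Resource):
    # flask-accepts converts the Marshmallow schema into restx models/parsers
    # behind the scenes, so the Swagger doc and validation still work
    @accepts(schema=TodoSchema, api=api)
    @responds(schema=TodoSchema, api=api)
    def post(self):
        # flask-accepts attaches the deserialised payload to the request
        return request.parsed_obj
```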
I like the idea though I would need more details on this part:
# args automatically validated and passed in as an instance of Task -
# no parser.parse_args
Do you have an idea of how we can split this work to help you out?
I also have a concern about one specificity of restplus/restx: the Wildcard field.
It is documented here. @sloria do you know if there is an equivalent in marshmallow?
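For context, here is a minimal sketch of how the existing Wildcard field is used today (based on the documented examples; the model names are just for illustration):

```python
from flask import Flask
from flask_restx import Api, fields

app = Flask(__name__)
api = Api(app)

# Accept arbitrary string-valued keys
wild = fields.Wildcard(fields.String)
wildcard_model = api.model("WildcardModel", {"*": wild})

# Globbing also works with a prefix, e.g. every key starting with "extra_"
prefixed_model = api.model("PrefixedModel", {"extra_*": fields.Wildcard(fields.Integer)})
```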
In marshmallow, the handling of unknown fields is configurable. https://marshmallow.readthedocs.io/en/stable/quickstart.html#handling-unknown-fields
There isn't a built-in way to do the globbing feature of wildcard, but using the above setting you can accept unknown fields.
Also, there's fields.Dict for nested, unstructured data: https://marshmallow.readthedocs.io/en/stable/api_reference.html#marshmallow.fields.Dict, but I'm not sure that meets the same use case.
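A minimal sketch of what those two options look like in Marshmallow (the schema names here are made up for illustration):

```python
from marshmallow import Schema, fields, INCLUDE


class TaskSchema(Schema):
    class Meta:
        # Keep unknown keys in the loaded data instead of raising or ignoring
        unknown = INCLUDE

    description = fields.String(required=True)


class MetadataSchema(Schema):
    # Unstructured nested data as a plain mapping
    extra = fields.Dict(keys=fields.String(), values=fields.String())
```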
Thanks for the quick reply.
My understanding is that unknown fields don't get parsed/validated.
The fields.Dict looks interesting, but like you said, I don't think it is what we need either.
I'll run some tests though.
You might also be interested in Schema.from_dict, which can be useful for validating fields that are only known at runtime. I wrote a bit about it here. It could also be useful for the dictionary-style API in @SteadBytes's comment.
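A tiny sketch of what that could look like for the dictionary-style definitions (the schema and field names are just for illustration):

```python
from marshmallow import Schema, fields

# Build a schema from a plain dict, e.g. driven by an api.model(...)-style definition
TodoSchema = Schema.from_dict(
    {"id": fields.Str(required=True), "task": fields.Str(required=True)},
    name="Todo",
)

errors = TodoSchema().validate({"id": "1"})
# -> {'task': ['Missing data for required field.']}
```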
So, one thing that's bothered me about "wildcard" is that it's not really what it is in json schema (which is where restx goes before openapi). It's "technically" a regex, right? So I think we need to look at how jsonschema + openapi deal with regexed values. I'm fairly certain what we'll see is that openapi doesn't actually handle regex values for object names (which is to say, in my quick search, I'm not seeing that...)
I need to give more time to reading this and looking at things. One thing that might help us ensure we're capturing the correct things is filling out some user stories, then ensuring the interface matches the expectations there.
I don't think such an equivalent exists in the specs either, and that's what led to #57.
But I know of several use cases for the Wildcard field. Now, since I'm the original author of this feature I can't argue that those use cases are bad, so if you think we shouldn't support them, then I'll deal with it.
Here was the original feature request: https://github.com/noirbizarre/flask-restplus/issues/172
Yep @sloria, Schema.from_dict was exactly what I was thinking of for that (I hacked together something using that whilst experimenting) 👍
@ziirish I'm not sure at the moment r.e. if/how the work could be split up. I'm going to take some more time over the weekend to look deeper into implementation (as opposed to just API design). Hopefully then I'll have a better idea 👍
@ziirish the use case of wildcard is fine. I think it's an odd implementation issue, and a mismatch with what folks expect to see. I commented on the issue with what I would expect to happen. I think it's just shifting things a bit; a wildcard is nothing but the most open-ended regex possible.
I need more dedicated time to think/look at things. Unfortunately I may not have that time soon. Just woke up early at my company "sprint" and it was too rainy to walk for coffee.
It seems like there's already an awful lot of activity around this area (especially in the marshmallow ecosystem). Specifically, flask-smorest seems to be almost identical to how flask-restx might work when using marshmallow. It uses the same existing libraries that we've been considering to do the 'heavy lifting' of request parsing/OpenAPI schema generation, e.g. webargs, apispec and marshmallow (along with the extensions each of these provide).
I'm not necessarily suggesting we don't go ahead here, but (personally at least) I'm struggling to justify the (fairly large) effort to properly replace the existing models, request parsing etc. - which would certainly introduce breaking API changes anyway - when such similar projects already exist :thinking: Does anyone else have a perspective on this? Like I say, I'm still happy to try and do this, but I wanted to get people's thoughts in general on whether it's worth it :sweat_smile:
Let's separate this into a few larger main points
- Interface
- Backend Model Implementation
- Requests Parser/Model Implementation
Interface
- Should we keep the interface(s) stable?
- Should we expose the underlying implementation more directly (or allow it)? Ex: let's say we remove the current Model implementation and switch to Marshmallow. Do we write an adapter that keeps our interface exactly the same? Do we just say "it's Marshmallow now!"? Do we allow a hybrid, where you can use what we have, but we also have a model.Marshmallow that specifically just takes a Marshmallow class?
Backend
- Keep the same, find another modeling/parsing/marshalling library, or re-write from the ground-up?
- What input and output formats do we want / are required? Meaning, what format(s) does OpenAPI expect? What options/mappings do we have from input to output?
- What libraries do it for us already? Do they operate in the way we want?
- Is there a way to support OpenAPI 2 and 3?
Requests
- What is the main use of the requests parser? How are people using it now? What do they want out of it? (I'm going to be honest, I never used it)
- Do we merge it into models but separate from the autodoc? Meaning create a model how we would normally, but don't register it to the API or Namespace.
Some initial answers from my recent notes made on the topic @j5awry 😊
Interface:
Should we keep the interface(s) stable?
Ideally, yes, but I don't think this is a hard requirement - especially if it makes the implementation much more complex.
Should we expose the underlying implementation more directly (or allow it)
My ideal goal here is to provide some abstraction that other serde/validation libraries can implement adapters for. E.g. a flask-restx model represents a contract that can be fulfilled by Marshmallow, pydantic etc. We would likely provide first-class support for one of these, but users can plug in others as necessary.
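Purely to make that concrete, one possible (entirely hypothetical) shape for such a contract - the names below are made up, not an existing flask-restx API:

```python
from typing import Any, Mapping, Protocol


class RestxModelContract(Protocol):
    """What a pluggable serde backend would need to provide (illustrative only)."""

    def json_schema(self) -> Mapping[str, Any]:
        """JSON Schema fragment used when building the OpenAPI document."""
        ...

    def load(self, payload: Mapping[str, Any]) -> Any:
        """Validate/deserialise an incoming request payload."""
        ...

    def dump(self, obj: Any) -> Mapping[str, Any]:
        """Serialise an object for the response body."""
        ...
```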
Backend:
Keep the same, find another modeling/parsing/marshalling library, or re-write from the ground-up?
Definitely not keep the same. As I mentioned previously, there's a lot of existing work in this area of validation/parsing and it makes sense to utilise this ecosystem where possible.
What input and output formats do we want / are required? Meaning, what format(s) does OpenAPI expect? What options/mappings do we have from input to output?
From the perspective of OpenAPI, all that matters is that we produce a valid OpenAPI schema according to the current JSON schema specification. From the end user perspective, this would be represented as JSON or YAML.
From the perspective of Flask-RESTX (as it currently stands at least), the input to this schema is Resources, Models and the various @api.doc decorators, all of which fundamentally modify the __schema__ property of the objects in question.
This is currently quite convoluted and tightly coupled to the implementation of pretty much everything. One of my desires for this effort is to decouple this.
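To illustrate what that looks like today (a minimal snippet against the existing flask-restx API; the model definition and printed output are just an example, shown roughly):

```python
from flask_restx import Model, fields

todo = Model("Todo", {
    "id": fields.String(required=True, description="The todo ID"),
    "task": fields.String(required=True),
})

# The Swagger generator ultimately reads this property when building the spec
print(todo.__schema__)
# roughly: {'required': ['id', 'task'], 'properties': {...}, 'type': 'object'}
```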
What libraries do it for us already? Do they operate in the way we want?
Some I have already mentioned:
- Marshmallow
- Webargs
- ApiSpec
- Pydantic
Whether they operate how we want depends on defining exactly what we want 🤣
Is there a way to support OpenAPI 2 and 3?
In theory, I don't see why not. However, IMO, OpenAPI 2 ought to be considered legacy and supported primarily for backwards compatibility.
Requests:
What is the main use of the requests parser? How are people using it now? What do they want out of it? (I'm going to be honest, I never used it)
Difficult to answer this in the general case - is there a way we could get some feedback from users here? Anecdotally, I've primarily used it in the past for validating and parsing filter parameters.
Do we merge it into models but separate from the autodoc? Meaning create a model how we would normally, but don't register it to the API or Namespace.
I'm not quite sure what you mean here, sorry. If a model exists to define what a request should expect as input then it should be part of the OpenAPI schema.
Hi guys, I'm new here. I would like to give some comments as I use this library.
What is the main use of the requests parser? How are people using it now? What do they want out of it? (I'm going to be honest, I never used it)
Funny to say that I use reqparse only!!! And I'm wondering how you manage to avoid it when you need to filter your data based on user requests. All public APIs work this way. The scenario is simple: when you want to expose your data from a data warehouse, you don't need CRUD ... just filtering data based on your custom logic. So I come with a generic set of parameters needed for each resource, and depending on the context I extend/remove parameters for specific resources.
So, to me reqparse is very important. It can be replaced by something else internally, but the main features need to stay (validation, location, order, add/remove per resource).
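To illustrate that pattern with the current reqparse API (the argument names here are made up for the example):

```python
from flask_restx import inputs, reqparse

# Generic filter/pagination arguments shared by every collection resource
base_parser = reqparse.RequestParser()
base_parser.add_argument("limit", type=int, location="args", default=50)
base_parser.add_argument("offset", type=int, location="args", default=0)

# Extended (or trimmed) per resource depending on the context
todo_parser = base_parser.copy()
todo_parser.add_argument("completed", type=inputs.boolean, location="args")
todo_parser.remove_argument("offset")
```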
Here I give a few ideas for your consideration. These are mostly related to the fact that in the future openapi will be aligned with json schema, see https://apisyouwonthate.com/blog/openapi-v31-and-json-schema-2019-09.
Currently it is possible to use json schemas to define models both for requests and responses. I mean something like the following:
```python
from flask import Flask
from flask_restx import Api, Resource

app = Flask(__name__)
api = Api(app)

request_schema = get_myendpoint_request_json_schema()
request_model = api.schema_model('myendpoint_request', request_schema)

response_schema = get_myendpoint_response_json_schema()
response_model = api.schema_model('myendpoint_response', response_schema)


@api.route('/myendpoint')
class MyEndpoint(Resource):
    @api.expect(request_model, validate=True)
    @api.doc(model=response_model)
    def get(self):
        return {}


app.run()
```
This is very useful in my opinion because json schema is a standard, so there is no need to implement anything to define models. Furthermore, data that follows some json schema could be received from some (potentially non-python) source and a flask-restx endpoint could include this as part of its response. In this case, extending the json schema from the original data source makes total sense instead of having to define the response model from scratch.
I have looked at alternatives such as marshmallow and fastapi, but none really let me do what I want with simple json objects and flask-restx. So I surely hope that this feature of flask-restx to define models from json schemas is preserved and even improved.
The json schemas could also be used for filtering when marshalling, by defining a subset schema that declares only the information that should be included in the response. This would mean implementing something like https://github.com/uber/json-schema-filter but for python.
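Something along these lines (a toy sketch only; a real implementation would need to recurse into nested objects/arrays and honour additionalProperties):

```python
def filter_by_schema(data: dict, subset_schema: dict) -> dict:
    """Keep only the keys declared in the subset schema's top-level properties."""
    allowed = subset_schema.get("properties", {})
    return {key: value for key, value in data.items() if key in allowed}


# Example: marshal out only the public fields of a record
public_todo_schema = {"type": "object", "properties": {"id": {}, "task": {}}}
print(filter_by_schema({"id": "1", "task": "x", "internal_flag": True}, public_todo_schema))
# -> {'id': '1', 'task': 'x'}
```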
I was wondering, given the TodoMVC example, were there meant to be 2 models, one for the request payload and one for the response?
In a lot of cases, there are certain properties that are synthetically generated by the application and shown in the response, but they are not properties that you set during the request.
Therefore, are they meant to be 2 different models? Or is there a way to have 1 model and declare which properties are needed during marshal in and which are needed during marshal out?
Oh, I now see that readOnly is meant to toggle properties between GET/POST etc. How come properties that are readOnly still show up for the POST method?
DW, I misread: it was meant to be readonly, not readOnly.
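For anyone else tripping over this: the Python-side keyword on flask-restx fields is readonly (lowercase), which ends up as readOnly in the generated schema. A minimal example:

```python
from flask import Flask
from flask_restx import Api, fields

app = Flask(__name__)
api = Api(app)

todo = api.model("Todo", {
    "task": fields.String(required=True),
    # `readonly` in Python; emitted as `readOnly: true` in the Swagger schema
    "created_at": fields.DateTime(readonly=True),
})
```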
Hi! I'm not sure if my inputs will make a difference but here they are -
- Keep the API less verbose; the marshmallow, webargs and apispec setup used to be great, but with all those decorators it's less readable.
- Can we draw from FastAPI's beautiful API? I propose we use something less verbose like pydantic, Faust or even Python dataclasses, if anything at all.
- A dependency injection feature really helps, even if it's only a single layer. By that I mean dependency injection that only works for endpoint methods; that's a great feature to have.
I have been trying to come up with something similar, FlaskEase; it is far from perfect with zero test coverage as of now. I want something similar but built with community effort, so that it's easy to maintain.
Are you guys planning to handle the host parameter in another way? https://github.com/python-restx/flask-restx/blob/master/flask_restx/swagger.py#L270
It's difficult when you want to generate the OpenAPI specification and put it into the repository as an individual file, or just test it at https://editor.swagger.io/.
An option to support backwards compatibility is to use Python reflection with the adapter pattern, so you adapt pydantic or any other framework into the current model. I wrote an adapter that works with the last version of flask_restplus, so it should be compatible with Flask-RESTX as long as there were no major changes to the Api models or fields interface.
It's incomplete with respect to all the properties on fields, but it can be extended for full support.
This modular approach could allow for the pydantic adapter to be in a different pip package. Developers would need a base class like SchemaAdapter and a way to register the adapter into the instance to override the base class.
""" Flask Adapter
"""
import datetime
import decimal
import re
from typing import List, Union
from flask_restplus import Api, Model, fields
from pydantic import BaseModel
class SchemaAdapter:
""" Example
"""
api: Api
def __init__(self, api: Api):
self.api = api
@staticmethod
def python_to_flask(python_type: type) -> str:
""" Converts python types to flask types
:param type python_type: type that is to be converted into flask type
:return: flask type represented as a string
:rtype: str
"""
if python_type is int:
return 'Integer'
if python_type in [float, decimal.Decimal]:
return 'Float'
if python_type is bool:
return 'Boolean'
if python_type is datetime.datetime:
return 'DateTime'
if python_type is datetime.date:
return 'Date'
return 'String'
def adapt(self, base_model: Any)-> Model:
""" Base implementation just returns the base_model
"""
return base_model
class FlaskRestPlusPydanticAdapter(SchemaAdapter):
"""
Adapter for flask rest plus pydantic
:param api flask_restplus.Api: flask_restplus Api instance to which is needed for \
api.model function call
"""
def adapt(self, base_model: BaseModel) -> Model:
"""
converts Pydantic model to flask rest plus model
:param base_model pydantic.BaseModel: Pydantic base model the will be
converted to use flask restplus Model
:return: Model instance
:rtype: flask_restplus.Model
"""
result = {}
entity_name = base_model.__model__ if hasattr(base_model, '__model__') else \
re.sub(r'(?<!^)(?=[A-Z])', '_', base_model.__name__).lower()
for name, python_type in base_model.__annotations__.items():
if '__' in name: #skip the python methods
continue
regex = None
description = ""
required = True
field_data = dict(base_model.__fields__.items())[name]
if field_data is not None and hasattr(field_data, 'field_info'):
regex = field_data.field_info.regex
description = field_data.field_info.description
required = field_data.required
# TODO implement all field attributes: idea make a dict of attributes
# pass down using the **attributes vs makeing variables for each
# union type includes Optional which is Union[type,None]
if hasattr(python_type, '__origin__') and python_type.__origin__ == Union:
args = list(python_type.__args__)
if type(None) in args:
required = False
args.remove(type(None))
python_type = args[0]
# List logic
if hasattr(python_type, '__origin__') and python_type.__origin__ in [List, list]:
args = list(python_type.__args__)
current_type = self.python_to_flask(args[0])
result[name] = fields.List(getattr(fields, current_type)(
readOnly=False, description=description, required=required, pattern=regex))
continue
# Nested classes
if hasattr(python_type, '__bases__') and BaseModel in python_type.__bases__:
result[name] = fields.Nested(self.pydantic_model(python_type))
continue
current_type = self.python_to_flask(self.python_to_flask)
result[name] = getattr(fields, current_type)(
readOnly=False, description=description, required=required, pattern=regex)
return self.api.model(entity_name, result)
From - https://raw.githubusercontent.com/bluemner/flask_restplus_pydantic/master/flask_restplus_pydantic/adapter.py
Hi @SteadBytes, I find your proposal very interesting. I think new features would be antifragile if they are:
- Backwards compatible (if possible)
- Extending new functionality
- Integrating work from others: modularity
The third point is important so as not to reinvent the wheel. That point has been used by successful modules like PIL (coupled with numpy) and Pandas (with numpy, sqlalchemy, xlsx modules...).
In fact, one of the most important features of restx for me is that I can add restx as a blueprint and have it as an extra module of my flask app.
Today I discovered a module, forked from restplus, that allows the integration of marshmallow and webargs (also hosted by marshmallow); it would be great for it to be integrated in a future version of restx.
Hi everyone, does anyone have a way of adding pydantic validation and doc generation to restx? Does @bluemner 's adapter work with flask-restx? Anyone have an example?
Thanks!
@conradogarciaberrotaran not to say pydantic is incompatible with flask-restx, but it represents a totally different approach (as @bluemner shows above). Work on this specific redesign has halted recently as the current set of maintainers moved into roles where flask-restx has not been part of their day to day work.
One thing, if you're looking for a platform now, there are several frameworks and plugins geared specifically toward pydantic. There's the Flask-Pydantic framework which seems geared toward the validation side. There is also a fairly young project called flask-pydantic-spec specifically for generating OpenAPI specs
Hopefully we'll be able to bring on more people, and help move this design work forward.
What is the main use of the requests parser? How are people using it now? What do they want out of it? (I'm going to be honest, I never used it)
I have to say that I use that a lot and it is a critical feature of flask-restx for me. I am not interested in models etc.
The Reqparser lets you quickly specify a no-nonsense input spec and handle the validation/conversion.
You can find an example usage in tshistory_rest.
Guys, I think we have to accept it now: the Flask community loves Marshmallow, and hence we have to choose one such Flask extension. So, I think APIFairy it is. Miguel built it, so it has to be stable; we can start using it and create PRs for features that we think we need. IMO.