
Cache the OpenAPISchema for faster responses

Open shmulvad opened this issue 1 year ago • 7 comments

Is your feature request related to a problem? Please describe.

I have a medium-sized API in terms of the number of endpoints and models (about 100 of each). Even at this size, generating the OpenAPI spec JSON can take about 0.7 seconds on my local development machine due to all the parsing of Pydantic models. This is fairly slow, and it also makes the /docs page slower, as it relies on the JSON.
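For reference, the slowdown can be measured with a small helper like the following (illustrative only, not part of django-ninja; api is assumed to be a NinjaAPI instance, or anything with a get_openapi_schema() method):

```python
import time

def time_schema_generation(api, runs=3):
    # Call the schema build repeatedly and return the best observed time;
    # min() filters out one-off noise from other processes.
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        api.get_openapi_schema()
        timings.append(time.perf_counter() - start)
    return min(timings)
```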

Describe the solution you'd like

I tried to modify the Django Ninja source code in the following manner so the OpenAPI schema is cached:

  1. In the main NinjaAPI class's __init__, add an instance variable self.openapi_schema: Optional[OpenAPISchema] = None.
  2. Change the method get_openapi_schema to the following:
def get_openapi_schema(
    self,
    *,
    path_prefix: Optional[str] = None,
    path_params: Optional[DictStrAny] = None,
) -> OpenAPISchema:
    # Return the schema generated on a previous request, if any
    if self.openapi_schema is not None:
        return self.openapi_schema
    if path_prefix is None:
        path_prefix = self.get_root_path(path_params or {})
    # Generate once and cache on the instance for subsequent requests
    self.openapi_schema = get_schema(api=self, path_prefix=path_prefix)
    return self.openapi_schema

This cuts my load time down to <0.1 seconds, which is a major win. It should be safe to cache the value, as the API schema should not change at runtime.

If you are interested, I can create a PR for this.

shmulvad avatar Jul 24 '24 17:07 shmulvad

How does it affect memory usage?

baseplate-admin avatar Jul 24 '24 18:07 baseplate-admin

Hi @shmulvad

Yeah, if you can measure your memory usage with and without the cache, it would be nice to know that number (if it turns out to be questionable, maybe it's better to cache the raw output of the docs page instead).

I'm not a fan of adding an extra property (openapi_schema) to NinjaAPI (maybe the functools.cache decorator would work as just a one-liner).

vitalik avatar Jul 24 '24 22:07 vitalik

Sure, functools.cache could be applied to this function instead. It should accomplish the same thing:

https://github.com/vitalik/django-ninja/blob/eecb05f8fbf147d8012072007bf18ce5abcfd420/ninja/openapi/schema.py#L27-L29
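As a minimal sketch of that idea (using a stand-in function rather than the real ninja.openapi.schema.get_schema, and a module-level counter purely for illustration), functools.cache memoizes on the call arguments, so repeated calls with the same arguments return the same object without rebuilding:

```python
from functools import cache

CALLS = {"count": 0}  # illustration only: counts how often the build runs

@cache
def get_schema(api, path_prefix=""):
    # Stand-in for the expensive OpenAPISchema build; the real
    # function walks all registered routes and Pydantic models.
    CALLS["count"] += 1
    return {"openapi": "3.1.0", "path_prefix": path_prefix}

first = get_schema("my-api", "/api")
second = get_schema("my-api", "/api")  # cache hit: no second build
assert first is second and CALLS["count"] == 1
```

One caveat: functools.cache requires hashable arguments, and the module-level cache keeps a reference to the api object (and the generated schema) for the lifetime of the process, which is exactly the memory trade-off being discussed here.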

Regarding me measuring the memory usage with/without the cache, I can do it and report back the bytes used by caching this in my particular case, but I don't see how that would be useful information in general. It would be directly correlated with the size of the generated JSON, and thus highly dependent on how many routes/models/etc. have been defined. I think in an ordinary application, the memory used by the OpenAPISchema would be negligible compared to the total application.

shmulvad avatar Jul 25 '24 04:07 shmulvad

Regarding me measuring the memory usage with/without the cache, I can do it and report back the bytes used by caching this in my particular case, but I don't see how that would be useful information in general.

The thing we want to learn is whether caching this will affect performance at scale.

Think of a large company's website (youtube.com/instagram.com), where every byte of allocation has an impact on the overall performance and responsiveness of the webapp.

It would be directly correlated with the size of the generated JSON, and thus highly dependent on how many routes/models/etc. have been defined.

I understand this, could you please share some metrics from your app in particular?


It would be directly correlated with the size of the

Side note: this part scares me. I am thinking of deploying a large OpenAPI schema on a relatively low-spec machine (think a $5 Linode instance). How will that affect performance?

baseplate-admin avatar Jul 25 '24 05:07 baseplate-admin

On my site, this JSON is about 215 KB. This is minuscule compared to how much memory my application consumes in total.

I really don't think memory would be an issue by caching this, but if you think so, may I suggest you let it be a setting then? In my case, I would much rather cache ~200 KB of data than I would wait an additional 0.6 s on every OpenAPISchema request generating the schema.
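A settings-gated version might look roughly like the following. This is a sketch only: cache_openapi_schema is a hypothetical flag, not an existing django-ninja setting, and _build_schema stands in for the real generation.

```python
from typing import Any, Dict, Optional

class NinjaAPI:
    # Sketch of an opt-in schema cache; not the real NinjaAPI class.
    def __init__(self, cache_openapi_schema: bool = True) -> None:
        self.cache_openapi_schema = cache_openapi_schema
        self._openapi_schema: Optional[Dict[str, Any]] = None
        self.builds = 0  # illustration only: counts schema generations

    def _build_schema(self) -> Dict[str, Any]:
        self.builds += 1
        return {"openapi": "3.1.0"}  # stand-in for the expensive build

    def get_openapi_schema(self) -> Dict[str, Any]:
        # Serve the cached schema only when the flag is enabled
        if self.cache_openapi_schema and self._openapi_schema is not None:
            return self._openapi_schema
        schema = self._build_schema()
        if self.cache_openapi_schema:
            self._openapi_schema = schema
        return schema
```

With the flag on, the expensive build runs once per process; with it off, every call regenerates the schema, preserving the current behavior.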

shmulvad avatar Jul 25 '24 07:07 shmulvad

may I suggest you let it be a setting then

This approach is better.

than I would wait an additional 0.6 s on every OpenAPISchema request generating the schema.

One question: Does DEBUG=False in settings.py make it faster?

baseplate-admin avatar Jul 25 '24 07:07 baseplate-admin

DEBUG=False does not really impact the load time.

shmulvad avatar Jul 25 '24 14:07 shmulvad