unit
Feature Request: Support Separating `SCRIPT_NAME` and `PATH_INFO` in Python
I am requesting that the Python module be updated to accept an optional script-name variable that is passed to WSGI applications in the `SCRIPT_NAME` environment variable, and that the `PATH_INFO` environment variable be updated to not start with the script-name path.
Currently, WSGI applications in Nginx Unit are passed the full request path in the `PATH_INFO` variable, even if the application can only be accessed on non-root paths. This differs from the behavior of Apache's `mod_wsgi`, which puts part of the path in `SCRIPT_NAME` and the remaining part in `PATH_INFO`.
The behavior of `mod_wsgi` means that a Django application only needs to be designed to handle its own URL routes, such as `/admin/`, but can be accessed using Apache at paths such as `/django-app/admin/` without any changes to the application.
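To make the difference concrete, here is a small sketch of the two WSGI environments for the same request. The mount point `/django-app`, the empty `SCRIPT_NAME` on the Unit side, and the helper `full_path` are illustrative assumptions, not output captured from either server.

```python
# Request: GET /django-app/admin/, application mounted at /django-app (assumed).

# As described in this issue: Unit puts the whole path in PATH_INFO.
# SCRIPT_NAME is assumed to be empty here.
unit_environ = {"SCRIPT_NAME": "", "PATH_INFO": "/django-app/admin/"}

# mod_wsgi: mount point in SCRIPT_NAME, remainder in PATH_INFO.
mod_wsgi_environ = {"SCRIPT_NAME": "/django-app", "PATH_INFO": "/admin/"}


def full_path(environ):
    """Rebuild the request path the way PEP 3333 describes URL reconstruction."""
    return environ.get("SCRIPT_NAME", "") + environ.get("PATH_INFO", "")


# Both environments describe the same URL; only the split differs.
assert full_path(unit_environ) == full_path(mod_wsgi_environ)
```

The application routes on `PATH_INFO`, so only the second split lets a URL configuration written for `/admin/` match without knowing where it is mounted.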
Because Nginx Unit puts the full path in the `PATH_INFO` variable, a Django application accessible at a non-root path such as `/django` must be configured to strip `/django` from the path before determining how to route the request, and if Unit is reconfigured to place the application under `/app`, the Django application would also need to be updated.
Do you know how it's implemented in Gunicorn?
Note also that you can avoid updating the app sources in this case by passing the value to Django through an environment variable configured in the `environment` option in Unit: https://unit.nginx.org/configuration/#configuration-apps-common
> Do you know how it's implemented in Gunicorn?
I haven't seen anything in the documentation or tutorials about having a single Gunicorn instance serve multiple applications at non-root paths.
> Note also that you can avoid updating the app sources in this case by passing the value to Django through an environment variable configured in the `environment` option in Unit: https://unit.nginx.org/configuration/#configuration-apps-common
This is true, but currently the `environment` option only affects system environment variables, not WSGI environment variables. This means that any Django application that currently works with Apache and `mod_wsgi` would need to be updated to:
- Check `os.environ` for the script name
- Call Django's `set_script_prefix` to set the script name globally
- Strip the script name from the beginning of the WSGI environment's `PATH_INFO` so the request routes to the proper view
- Continue on as normal
Implementing my request (or something similar) would allow applications to be run under both Nginx Unit and Apache with no modifications.
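For reference, the application-side changes listed above could look roughly like the following WSGI wrapper. This is a minimal sketch, not a tested Unit recipe: the variable name `APP_SCRIPT_NAME` and the function `script_name_middleware` are hypothetical, and instead of calling `set_script_prefix` directly it sets `SCRIPT_NAME` in the WSGI environment and relies on Django reading that key on each request.

```python
import os


def script_name_middleware(app, script_name=None):
    """Wrap a WSGI app so it sees SCRIPT_NAME and PATH_INFO split like mod_wsgi."""
    if script_name is None:
        # APP_SCRIPT_NAME is a made-up name; it would be set through
        # Unit's `environment` option for the application.
        script_name = os.environ.get("APP_SCRIPT_NAME", "")
    prefix = script_name.rstrip("/")

    def wrapper(environ, start_response):
        path = environ.get("PATH_INFO", "")
        # Only strip on a path-segment boundary, so /app does not match /apple.
        if prefix and (path == prefix or path.startswith(prefix + "/")):
            environ["SCRIPT_NAME"] = environ.get("SCRIPT_NAME", "") + prefix
            environ["PATH_INFO"] = path[len(prefix):]
        return app(environ, start_response)

    return wrapper
```

In a project's `wsgi.py` this would be applied as `application = script_name_middleware(get_wsgi_application())`. It works, but it is exactly the per-application modification this request is trying to avoid.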
Sure, understood.
But what I dislike about the `SCRIPT_NAME` approach is that it adds an additional WSGI variable that has to be passed with every request, while in most cases it is just a constant that could be set once at the initialization stage of the app.
Django checks for the `SCRIPT_NAME` variable on every request, whether the WSGI variable is set or not. My bigger concern is stripping the script prefix from the path so Django can route it properly. This will have to be done either at the server level or at the application level, and I think doing it at the server level improves portability.
For context, I am part of a team that develops applications for internal use at my organization. These applications are developed using Django and are served by a single Apache Web Server instance using `mod_wsgi`.
When developing a Django application, a development server is used and the application is served from the root path of `localhost`, but in production the application is served from a non-root path using the `WSGIScriptAlias` directive. `mod_wsgi` uses that directive to set `SCRIPT_NAME` and to remove the same prefix from the start of `PATH_INFO`, so the Django application doesn't have to do it.
My team considered migrating from Apache to Unit because it would make migrating projects to newer Python versions easier, but the inability to host our applications without modification is preventing us from doing that.
If this feature (or something similar) is not added, I believe the Unit documentation should be updated to warn that Django applications cannot be served from non-root paths without modification.
We've implemented this with a subclass of `WSGIHandler`. Not ideal, as `WSGIHandler` is not supposed to be public...

    import django
    from django.core.handlers.wsgi import WSGIHandler


    class TBWSGIHandler(WSGIHandler):
        def __init__(self, *args, **kwargs):
            super().__init__(*args, **kwargs)

        def _set_script_name_path_info(self, environ):
            '''Set the script_name and path_info

            Before: PATH_INFO=/one/two/three
            After:  SCRIPT_NAME=/one PATH_INFO=/two/three
            '''
            # First path segment becomes SCRIPT_NAME, the rest stays in PATH_INFO.
            split_path = environ['PATH_INFO'].split('/')
            environ['SCRIPT_NAME'] = '/'.join(split_path[:2])
            environ['PATH_INFO'] = '/' + '/'.join(split_path[2:])

        def __call__(self, environ, start_response):
            self._set_script_name_path_info(environ)
            return super().__call__(environ, start_response)


    def get_tb_wsgi_application():
        """ TB Shim around django wsgi handler """
        django.setup(set_prefix=False)
        return TBWSGIHandler()
That would work for most of our applications, but this code only uses the first path segment as the script name, and we have some cases where we want the script name to consist of multiple path segments (e.g. `/hr/forms/` serves the static file `/hr/forms/index.html` containing links to hiring and time-off forms, but `/hr/forms/time-off/` and `/hr/forms/hiring/` are both Django applications).
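As a rough illustration of what a multi-segment split would need, the sketch below picks the longest configured mount point instead of assuming a single segment. The function `split_path` and the `MOUNTS` list are hypothetical names for this example only.

```python
def split_path(path, mount_points):
    """Split a request path into (script_name, path_info).

    Chooses the longest mount point that matches on a segment boundary
    and returns ("", path) when none match.
    """
    best = ""
    for mount in mount_points:
        prefix = mount.rstrip("/")
        if prefix and (path == prefix or path.startswith(prefix + "/")):
            if len(prefix) > len(best):
                best = prefix
    return best, path[len(best):]


# Hypothetical mount points taken from the example paths above.
MOUNTS = ["/hr/forms/time-off/", "/hr/forms/hiring/"]

script_name, path_info = split_path("/hr/forms/time-off/new/", MOUNTS)
```

With these inputs `script_name` is `/hr/forms/time-off` and `path_info` is `/new/`, while a request for `/hr/forms/index.html` is left untouched because it matches neither mount point.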
There are other workarounds we could use in those applications, and we may end up using them if we ultimately decide to migrate to Unit, but the ability to serve multiple Django applications from a single server (using different Python versions, even!) was a major reason we started investigating Unit in the first place.
The fact that multiple out-of-the-box Django applications cannot be served by a single server should at least be mentioned in the documentation so newcomers know about it from the beginning.