
Apache 2.4 crash when using pydantic

Open asaff1 opened this issue 1 year ago • 8 comments

I believe the issue is with mod_wsgi. It happens ONLY when using mod_wsgi. When using Django's runserver it works fine.

See the issue: https://github.com/pydantic/pydantic/issues/11053 Running on Windows. Not sure if it happens on Linux as well.

How can I debug this issue? Apache logs nothing at all; I only see the server crash.

UPDATE

  • Issue is Windows only
  • Reproduction: https://github.com/asaff1/pydantic_mod_wsgi_bug

asaff1 avatar Dec 06 '24 13:12 asaff1

You don't provide what mod_wsgi configuration you are using, nor a full list of what Python packages your application uses.

This issue may not be directly caused by pydantic, but indirectly by some other Python package that pydantic depends on, or even some Python package you use elsewhere in your application.

Specifically, the problem may be that some Python package being used is not designed and implemented properly to work in a Python sub interpreter. When this is the case Python packages can hang due to internal Python threading deadlocks or other reasons.

Sub interpreters become an issue because by default mod_wsgi will use a Python sub interpreter context to run your application rather than the main interpreter context.

Look through your mod_wsgi configuration and if you don't have:

WSGIApplicationGroup %{GLOBAL}

then you are relying on default behaviour.

So if not present, try adding that directive into the mod_wsgi configuration in Apache, which will force the use of the main Python interpreter context instead.

See also:

  • https://modwsgi.readthedocs.io/en/master/user-guides/application-issues.html#python-simplified-gil-state-api

for more details.
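To confirm which interpreter context an application actually ends up in, a small WSGI test script can report the application group. This is a minimal sketch, not part of the original report: it assumes the `mod_wsgi.application_group` key that mod_wsgi places in the WSGI environ, where an empty string indicates the main interpreter. The wording of the response text is purely illustrative.

```python
def application(environ, start_response):
    # mod_wsgi sets this key in the WSGI environ; an empty string
    # means the application is running in the main interpreter.
    group = environ.get("mod_wsgi.application_group")

    if group is None:
        where = "not running under mod_wsgi"
    elif group == "":
        where = "main interpreter"
    else:
        where = "sub interpreter %r" % group

    body = ("mod_wsgi.application_group = %r\n%s\n" % (group, where)).encode("utf-8")

    start_response(
        "200 OK",
        [
            ("Content-Type", "text/plain; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ],
    )
    return [body]
```

If the response shows a non-empty group name, the `WSGIApplicationGroup %{GLOBAL}` directive is probably not being applied to that application.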

GrahamDumpleton avatar Dec 06 '24 20:12 GrahamDumpleton

Hi @GrahamDumpleton, please look at the issue I linked. In a comment there I posted a repo with a minimal application that reproduces the issue:

https://github.com/asaff1/pydantic_mod_wsgi_bug

You can see the apache config, packages, etc. Please try it and see.

You can see there that I already use WSGIApplicationGroup %{GLOBAL} (I have used mod_wsgi for quite some time. I had to add that because of packages like numpy, as you said in the docs).

I could not find any cause for why this happens, and I don't see any log from Apache.

asaff1 avatar Dec 06 '24 21:12 asaff1

For the versions of pydantic you think it started with, you are going to have to work out what versions of pydantic-core started to be used. This issue is likely going to be related to the Python extension implemented in Rust. Something about how it works is probably not compatible with memory models in an embedded system like mod_wsgi inside of Apache.
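One way to record which pydantic-core version is paired with each pydantic release under test is to query the installed package metadata from inside the same environment Apache uses. The following is only a sketch using the standard library's `importlib.metadata`. The helper names and the default list of distribution names are assumptions made for illustration.

```python
from importlib import metadata
import platform


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def environment_report(dist_names=("pydantic", "pydantic-core", "mod_wsgi")):
    """Collect interpreter, platform, and package versions in one dict."""
    report = {
        "python": platform.python_version(),
        "platform": platform.platform(),
    }
    for name in dist_names:
        report[name] = installed_version(name)
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print("%s: %s" % (key, value))
```

Running it once per pydantic version being compared gives a list of pydantic and pydantic-core pairs to match against the point where the crash starts to appear.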

GrahamDumpleton avatar Dec 06 '24 23:12 GrahamDumpleton

@GrahamDumpleton Did you reproduce the issue with my repo? I wonder if it happens on Linux as well. I also thought that the Rust extension is likely the cause.

Can you elaborate a bit what can cause this and how to debug it?

Pydantic is a very popular library today, so I think this issue is going to affect many users.

asaff1 avatar Dec 07 '24 13:12 asaff1

I have not had any time to look at it as prepping for a trip. On that trip I will have no computer access so I will not be able to look at it for a while. So if I don't get a chance to look at it, it will need to wait.

Since the pydantic versions you believe started triggering the issue have been available for up to a year, it seems like you are the only person affected. Your use of such an old Python version could be related. Can you update to a newer Python version?

Any Python extension which doesn't get reference counting correct on Python objects can cause subtle issues which may only show up when the memory layout of objects and how free memory is reused results in it occurring. So it may only occur in your specific environment and be hard to reproduce. Anyway, that is assuming it is a reference counting issue. I know nothing about pydantic and what it is doing behind the scenes so cannot really judge. That they use Rust to implement the Python extension will make things even harder to sort out.
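When a process dies without anything reaching the Apache logs, Python's standard `faulthandler` module can sometimes still capture the Python-level stack at the moment of a fatal error. This is a hedged sketch rather than something tried in this thread: the log path and the `enable_crash_log` helper are hypothetical, and whether any traceback is produced depends on how the process actually dies.

```python
import faulthandler
import os
import tempfile

# Hypothetical location; any path the Apache service account can write to would do.
CRASH_LOG_PATH = os.path.join(tempfile.gettempdir(), "wsgi_faulthandler.log")

# Module-level reference so the file stays open for as long as
# faulthandler may need to write to it.
_crash_log = None


def enable_crash_log(path=CRASH_LOG_PATH):
    """Dump Python tracebacks for all threads to `path` on a fatal error."""
    global _crash_log
    _crash_log = open(path, "a")
    faulthandler.enable(file=_crash_log, all_threads=True)
    return _crash_log
```

Calling `enable_crash_log()` at the top of the WSGI script file, before the application is imported, would give the handler the best chance of being installed before the crash occurs.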

GrahamDumpleton avatar Dec 08 '24 03:12 GrahamDumpleton

@GrahamDumpleton Thanks, honestly I don't have a clue yet. But I have some more details.

  • I've tested on Windows, Python 3.11 and mod-wsgi==5.0.2 -> the issue still happens.

  • I did another test on Linux: I used Docker (under Windows) with the python:3.11 image and started the server with mod_wsgi-express start-server -> the issue does not exist!

  • For completeness, I also tried with mod_wsgi-express start-server on Windows (I know it is not well supported - never actually used it in production) => Issue exists.

So for now I can say it only affects Windows machines. (I've tested on two different Windows 10 machines.) What difference between mod_wsgi on Linux and on Windows could cause the bug?

asaff1 avatar Dec 08 '24 18:12 asaff1

Hi @GrahamDumpleton, any chance you have tested the issue? I've made a really good reproduction in the repo linked in the description.

asaff1 avatar Jan 29 '25 10:01 asaff1

No, I haven't. Since getting back from my extended trip I have started a new job and have had little spare time to do anything.

I also don't have a usable Windows system to investigate at the moment.

GrahamDumpleton avatar Jan 29 '25 10:01 GrahamDumpleton