cpython icon indicating copy to clipboard operation
cpython copied to clipboard

Port syslog module to use module state

Open corona10 opened this issue 3 years ago • 9 comments

While I check the @ericsnowcurrently 's checklist: https://github.com/ericsnowcurrently/multi-core-python/wiki/0-The-Plan I found that the syslog module still uses the global variable for the following variables.

  • S_ident_o
  • S_log_open

Both variables are declared as global variables since the original author assumes that in only one instance, only one syslog will exist.(one process one interpreter)

So I think that it's okay to move them into the module state if we consider the per-interpreter model.

  • PR: gh-99128

corona10 avatar Nov 05 '22 13:11 corona10

Hmm, I think that I miss understood about syslog, I close ths issue as invalid.

corona10 avatar Nov 05 '22 13:11 corona10

I've added some review comments to the PR, it seems to be on right path with some minor issues because the syslog extension changes process global state in the syslog library. Those issues IMHO can be easily fixed.

ronaldoussoren avatar Nov 05 '22 13:11 ronaldoussoren

I've added some review comments to the PR, it seems to be on right path with some minor issues because the syslog extension changes process global state in the syslog library. Those issues IMHO can be easily fixed.

Thanks I will take a look.

corona10 avatar Nov 05 '22 14:11 corona10

Note that currently the syslog module already has the following characteristics:

  • an app that embeds CPython may call openlog() independently of any use of the syslog module, and the two may interfere with each other
  • two interpreters using the syslog module may interfere with each other in the same way

We can't do much about the embedding case because the posix syslog API doesn't provide a way to get the current configuration (if any) such that you could restore it.

For multiple interpreters, either we leave the current behavior or we try to make each interpreter independent. To do that, we have two options:

  • swap in each interpreter's configuration around each syslog.syslog() call
    • call openlog() before and closelog() after
    • call setlogmask() before and after
    • syslog.openlog() and syslog.setlogmask() would only store the values on the module state
  • do similar, but be smarter about the single-interpreter case and cases where swapping is not necessary (i.e. same syslog config)

The "smarter" approach would require that the syslog module keep process-global state, along with a lock to protect that state. That would require something similar to what is discussed in https://discuss.python.org/t/20668 (and https://discuss.python.org/t/20663).


For reference:

  • https://pubs.opengroup.org/onlinepubs/009695399/functions/openlog.html
  • https://www.gnu.org/software/libc/manual/html_node/openlog.html
  • https://stackoverflow.com/questions/12987180
  • https://sourceware.org/git/?p=glibc.git;a=blob;f=misc/syslog.c;h=f67d4b58a448cabdf63f629f72a9df6e009286cf;hb=refs/heads/master
  • https://ruby-doc.org/stdlib-2.7.1/libdoc/syslog/rdoc/Syslog.html
  • https://docs.python.org/3/library/syslog.html

ericsnowcurrently avatar Nov 07 '22 19:11 ericsnowcurrently

FWIW, for an implementation of mutli-phase init for a module, I would expect it to either deal with the sort of weird behavior syslog would have, or disallow use in multiple interpreters (at least in problematic cases).

ericsnowcurrently avatar Nov 07 '22 20:11 ericsnowcurrently

We can't do much about the embedding case because the posix syslog API doesn't provide a way to get the current configuration (if any) such that you could restore it.

This is a typical example of a library with process-wide state. It's meant to be used by “the” application, rather than “a” library. I don't think Python's stdlib should go out of this way to adapt the API to multiple libraries that aren't aware of each other. (On the other hand, Python's C API provide the tools to allow third-parties to do that.)

So I'd do straightforward locking that's useful for applications, but if misused (used from a library) you get errors rather than unintended behavior. Specifically:

Calls to openlog, closelog and setlogmask can only be done in one module. After one of them is called, all will fail when called from a different module object. The restriction is reset when the module is garbage-collected. (Typically, you only get different module objects in different interpreters – unless you do del sys.modules['syslog'], which is documented to have weird effects. So, user-facing docs can explain things in terms of interpreters rather than module objects.)

That can always be relaxed if we find “obviously good” semantics later. With more complicated solutions like swapping the state around, we could paint ourselves in a corner. Unlike a third-party library, we can't easily drop compatibility and allow people to pin <2.0.

(Anyway, even for this “simple” locking, to support multiple GILs we'd need a process-global lock.)

encukou avatar Nov 08 '22 09:11 encukou

This is a typical example of a library with process-wide state. It's meant to be used by “the” application, rather than “a” library.

My position is that process-wide state belongs to the main interpreter. That's where the app lives.

I don't think Python's stdlib should go out of this way to adapt the API to multiple libraries that aren't aware of each other.

+1

With more complicated solutions like swapping the state around, we could paint ourselves in a corner.

Agreed.

Furthermore, now that I think about it, the simplest solution is to disallow syslog.openlog() and syslog.closelog() (and syslog.setlogmask()) in subinterpreters (ergo, only allowed in the main interpreter). Then there's no need for process-wide state or a global lock. As you said, "that can always be relaxed if we find “obviously good” semantics later."

ericsnowcurrently avatar Nov 08 '22 16:11 ericsnowcurrently

@ericsnowcurrently Do we have a public API to check whether the current interpreter is main interpreter? Currently, we can check the status if the module is a core module but syslog module is not core module, so such APIs will be needed for 3rd party authors.

corona10 avatar Nov 10 '22 03:11 corona10

Do we have a public API to check whether the current interpreter is main interpreter?

We have PyInterpreterState_Main(), so you can do current_interp == PyInterpreterState_Main(). We also have an internal _Py_IsMainInterpreter(interp), in case that would be worth making public.

ericsnowcurrently avatar Nov 10 '22 18:11 ericsnowcurrently

We might need to do a similar trick for the readline module.

erlend-aasland avatar Nov 30 '22 12:11 erlend-aasland