wakepy
wakepy copied to clipboard
wakepy freezes ubuntu when using it repeatedly (10k times) in a loop on Ubuntu (Gnome SessionManager)
Found this one accidentally when debugging https://github.com/fohrloop/wakepy/issues/276. This will freeze Ubuntu 22.04, GNOME 42.9, development version of 0.8.0 wakepy (for example https://github.com/fohrloop/wakepy/commit/c574e529ea01fdeb4dbb4469bf9d1b84841af730):
from wakepy import keep
for i in range(10_000):
with keep.running() as m:
print(i)
About the behavior:
- It's not specific to wakepy. The same behavior can be reproduced by using just jeepney or dbus.
- If it does not freeze the system, it will make mouse cursor laggy. This lag is removed after logout -> login.
- Adding delay between function calls fixes the problem. For example, using 50ms delay and 80k iterations will not reproduce the issue. See this comment.
- Same behavior on Ubuntu with org.freedesktop.ScreenSaver (also on wakepy 0.7.x)
- This does not occur on Fedora 40 (Gnome 46 + dbus-broker), so either updating Gnome from 42.9 to 46 or changing dbus-daemon to dbus-broker probably fixes the broblem. (Thanks @takluyver)
Task:
- Find out what is causing the freeze and fix it
Debugging: bottom-up
Trying to reproduce the behavior with smallest amount of code possible.
1) It's not caused by just opening and closing connections
This causes no problems:
from jeepney.io.blocking import open_dbus_connection
for i in range(10_000):
print(i)
connection = open_dbus_connection(bus="SESSION")
connection.close()
2) A simple DBus method call does not cause the issue
This causes no problems:
from jeepney import DBusAddress, new_method_call
from jeepney.io.blocking import open_dbus_connection
from jeepney.wrappers import unwrap_msg
for i in range(10_000):
print(i)
connection = open_dbus_connection(bus="SESSION")
addr = DBusAddress(
object_path="/org/freedesktop/DBus",
bus_name="org.freedesktop.DBus",
interface="org.freedesktop.DBus",
)
msg = new_method_call(
addr,
method="ListNames",
signature=None,
body=None,
)
reply = connection.send_and_get_reply(msg, timeout=1)
resp = unwrap_msg(reply)
connection.close()
3) Inhibit + Uninhibit DBus calls to SessionManager
This will freeze my Ubuntu 22.04 laptop with GNOME 42.9 in around 400-600 iterations:
from jeepney import DBusAddress, new_method_call
from jeepney.io.blocking import open_dbus_connection
from jeepney.wrappers import unwrap_msg
addr = DBusAddress(
object_path="/org/gnome/SessionManager",
bus_name="org.gnome.SessionManager",
interface="org.gnome.SessionManager",
)
def call(method, signature=None, body=None):
msg = new_method_call(
addr,
method=method,
signature=signature,
body=body,
)
reply = connection.send_and_get_reply(msg, timeout=1)
return unwrap_msg(reply)
for i in range(10_000):
connection = open_dbus_connection(bus="SESSION")
cookie = call(
method="Inhibit",
signature="susu", # "app_id", "toplevel_xid", "reason", "flags"
body=("wakepy", 42, "wakelock active", 8),
)[0]
inhibitors = call(
method="GetInhibitors",
)[0]
print(f"{i}, {cookie=}, inhibitors = {len(inhibitors)}")
call(
method="Uninhibit",
signature="u", # "cookie"
body=(cookie,),
)
inhibitors = call(
method="GetInhibitors",
)[0]
print(f" {i}, {cookie=}, inhibitors = {len(inhibitors)}")
connection.close()
I can see that the number of inhibitors is either 1 or 2, so these are not piling up:
0, cookie=618529695, inhibitors = 2
0, cookie=618529695, inhibitors = 1
1, cookie=1846700224, inhibitors = 2
1, cookie=1846700224, inhibitors = 1
Variations:
- Add
gc.collect()afterconnection.close()to get into about ~2000 iterations before mouse starts to feel laggy. But system will not freeze even with 10k iterations. After finishing, mouse is permanently laggy until relogin. - Using
random.randint(1, 100_000)in place of 42 as toplevel_xid has no effect.
4) Test with dbus library instead of jeepney
This also freezes the system in 400-600 cycles:
import dbus
for i in range(10_000):
session_bus = dbus.SessionBus()
proxy = session_bus.get_object(
"org.gnome.SessionManager", "/org/gnome/SessionManager"
)
cookie = proxy.get_dbus_method(
"Inhibit", dbus_interface="org.gnome.SessionManager"
)("wakepy", 42, "wakelock active", 8)
inhibitors = proxy.get_dbus_method(
"GetInhibitors", dbus_interface="org.gnome.SessionManager"
)()
print(f"{i}, {cookie=}, inhibitors = {len(inhibitors)}")
proxy.get_dbus_method("Uninhibit", dbus_interface="org.gnome.SessionManager")(
int(cookie)
)
inhibitors = proxy.get_dbus_method(
"GetInhibitors", dbus_interface="org.gnome.SessionManager"
)()
print(f" {i}, {cookie=}, inhibitors = {len(inhibitors)}")
session_bus.close()
Summary
It looks like it's not caused by jeepney since jeepney can process other dbus calls (ListNames example above) just fine. This also is reproduced with dbus-python, so it looks like it's connected to GNOME. I created an issue in the GNOME GitLab repository: gnome-settings-daemon/#798. I'll wait comments there before proceeding with this.
~I just figured out that this does not seem to cause any slow-downs~: EDIT: No. it does. It's just a bit random with all the methods that how many cycles they need until there's slow-down and freeze.
Using con.close() + gc.collect() and just jeepney
Note: removing the gc.collect() will make freeze the system around 2000 iterations.
import gc
from jeepney import DBusAddress, new_method_call
from jeepney.io.blocking import open_dbus_connection
from jeepney.wrappers import unwrap_msg
addr = DBusAddress(
object_path="/org/gnome/SessionManager",
bus_name="org.gnome.SessionManager",
interface="org.gnome.SessionManager",
)
def call(method, signature=None, body=None):
msg = new_method_call(
addr,
method=method,
signature=signature,
body=body,
)
reply = connection.send_and_get_reply(msg, timeout=1)
return unwrap_msg(reply)
for i in range(10_000):
connection = open_dbus_connection(bus="SESSION")
cookie = call(
method="Inhibit",
signature="susu", # "app_id", "toplevel_xid", "reason", "flags"
body=("wakepy", 42, "wakelock active", 8),
)[0]
inhibitors = call(
method="GetInhibitors",
)[0]
print(f"{i}, {cookie=}, inhibitors = {len(inhibitors)}")
call(
method="Uninhibit",
signature="u", # "cookie"
body=(cookie,),
)
inhibitors = call(
method="GetInhibitors",
)[0]
print(f" {i}, {cookie=}, inhibitors = {len(inhibitors)}")
connection.close()
gc.collect()
so why does:
from wakepy import keep
for i in range(10_000):
with keep.running() as m:
print(i)
~still after #282 cause mouse to be laggy/choppy and require a relogin, since what they both should do is (1) close connection (2) gc.collect()? Furthermore, since the first one works without issues (after the gc.collect()), I'm starting to suspect that it is not a problem in Gnome but somewhere in the python world.~
Just tested on the same system and also the "org.freedesktop.ScreenSaver" (keep.presenting mode) behaves exactly the same. It's likely that the org.freedesktop.ScreenSaver interface is registered by some Gnome process when running Gnome, so this is somewhat expected. While this is an annoying bug, I don't know if it's wakepy, jeepney, dbus, CPython or Gnome (or something else) where the fix could and should be made. It's also been there already in wakepy 0.7.x (using org.freedesktop.ScreenSaver), so I think it's okay to proceed with releasing 0.8.0 with this. Luckily it's relatively uncommon to inhibit sleep hundreds or thousands times. I hope to gather more understanding on this and fix it or find an alternative way for inhibiting sleep on Gnome.
Adding delay between calls
Adding delay between function calls seems to fix the problem. I left my computer to do this (with wakepy on) for few hours:
code with four sleeps in it
import time
from jeepney import DBusAddress, new_method_call
from jeepney.io.blocking import open_dbus_connection
from jeepney.wrappers import unwrap_msg
addr = DBusAddress(
object_path="/org/gnome/SessionManager",
bus_name="org.gnome.SessionManager",
interface="org.gnome.SessionManager",
)
def call(method, signature=None, body=None):
msg = new_method_call(
addr,
method=method,
signature=signature,
body=body,
)
reply = connection.send_and_get_reply(msg, timeout=1)
return unwrap_msg(reply)
for i in range(10_000):
connection = open_dbus_connection(bus="SESSION")
time.sleep(0.05) #1
cookie = call(
method="Inhibit",
signature="susu", # "app_id", "toplevel_xid", "reason", "flags"
body=("wakepy", 42, "wakelock active", 8),
)[0]
time.sleep(0.2) #2
print(f"{i}, {cookie=}")
call(
method="Uninhibit",
signature="u", # "cookie"
body=(cookie,),
)
time.sleep(0.5) #3
connection.close()
time.sleep(0.5) #4
When I came back, system was not suffering of any kind of slowness. So most likely this is about doing the function calls fast and repeatedly. The fact that calling gc.collect() did help a bit in the problem supports this, as calling it takes some time.
Narrowing down the problem
Here are some results when removing one of the sleeps:
single sleep removed
- When sleep #1 is removed, I could not see any slow-downs (29k iterations).
- When sleep #2 is removed, I could not see any slow-downs (31k iterations).
- When sleep #3 is removed, I could not see any slow-downs (20k iterations).
- When sleep #4 is removed, I could not see any slow-downs (20k iterations).
two sleeps removed
- When sleep #1 and #2 are removed, I could not see any slow-downs (30k iterations).
- When sleep #1 and #3 are removed, I could not see any slow-downs (30k iterations).
- When sleep #1 and #4 are removed, I could not see any slow-downs (30k iterations).
- When sleep #2 and #3 are removed, I could not see any slow-downs (30k iterations).
- When sleep #2 and #4 are removed, I could not see any slow-downs (30k iterations).
- When sleep #3 and #4 are removed, I could not see any slow-downs (30k iterations).
three sleeps removed
- When only sleep #1 is used, I could not see any slow-downs (30k iterations).
- When only sleep #2 is used, I could not see any slow-downs (30k iterations).
- When only sleep #3 is used, I could not see any slow-downs (30k iterations).
- When only sleep #4 is used, I could not see any slow-downs (30k iterations).
Summary
Adding sleep anywhere within the full cycle (open connection, inhibit sleep, uninhibit, close connection) seems to fix the problem. Shortest sleep tried was 50ms. In addition to this, I tested with wakepy with 50ms sleep and 80k iterations:
for i in range(80_000):
with keep.running() as m:
print(i, m.used_method)
time.sleep(0.05)
and could not reproduce any slow-downs. This seems to be a non-issue as the sleep is typically inhibited for a long-running task, and it's typical to have only one or few locks with one task.