Mount hangs when `cli` called with `asyncio.run_in_executor`

Open phlax opened this issue 3 years ago • 12 comments

I have been experimenting with ratarmount and it is mostly working nicely so far, thanks.

A problem I have hit, and am struggling to figure out, is that when I call cli(args) directly from my code, mounting works as expected.

When I shift the same code to run under asyncio.run_in_executor, the mountpoint is created but the program hangs as if it's running in the foreground.

When I press Ctrl-C, my escape hatch unmounts successfully, as expected.

I'm wondering if there is some TTY detection or similar going on in the FUSE lib, and whether there is a way to force return/backgrounding.

Possibly related: #76

sys info:

fuse/stable,now 2.9.9-5 amd64 [installed]
#1 SMP Debian 5.10.113-1 (2022-04-29) GNU/Linux
ratarmount==0.11.3 
PYTHON_VERSION = "3.10.2"

phlax avatar Jul 07 '22 09:07 phlax

there doesn't appear to be any difference in os.environ inside and outside the executor, so it doesn't appear to be TTY detection

phlax avatar Jul 07 '22 09:07 phlax

Care has to be taken because the daemonizing forks into the background and I have to join all threads, e.g., for the parallel bz2 decoder, before that fork and reopen it after it has forked. Aside from that, I'm out of ideas regarding problems when daemonizing :/

mxmlnkn avatar Jul 07 '22 11:07 mxmlnkn

I think it may be this: https://github.com/libfuse/libfuse/issues/382

although I'm somewhat confused, as I'm using a ProcessPoolExecutor, which should theoretically banish thread problems - either way, I think it has something to do with signal handling

I have tried stepping through/into the code with pdb and it hangs on the call to:

err = _libfuse.fuse_main_real(
...

in fusepy/fuse.py

phlax avatar Jul 07 '22 11:07 phlax

possibly related https://groups.google.com/g/comp.lang.python/c/tkS3VvyLD1M

phlax avatar Jul 07 '22 12:07 phlax

another ~related discussion https://groups.google.com/g/python-tulip/c/91NCCqV4SFs

this seems to resolve it, although I'm not at all clear on the implications:

import multiprocessing

multiprocessing.set_start_method('forkserver')

phlax avatar Jul 07 '22 12:07 phlax

actually I think it didn't work - it instead throws an error: "A process in the process pool was terminated abruptly while the future was running or pending."

phlax avatar Jul 07 '22 12:07 phlax

I can work around it by avoiding run_in_executor and just calling the tool, so it's not a huge blocker for us, and it speeds things up considerably compared to copying out tarballs everywhere - so, thanks again

would be good to know what is going on, still 8/

phlax avatar Jul 07 '22 15:07 phlax

> I can work around it by avoiding run_in_executor and just calling the tool, so it's not a huge blocker for us, and it speeds things up considerably compared to copying out tarballs everywhere - so, thanks again
>
> would be good to know what is going on, still 8/

:) Good to hear that it can be worked around.

Currently, I'm a bit low on time, especially as I want to prioritize pragzip, the custom-written parallel random-access gzip backend, but I might take a closer look at it in the future, especially as the macOS issue that you referenced is still open. Furthermore, porting ratarmount to FUSE3 is still on my to-do list, which would require using something other than fusepy. Maybe that alleviates the problem, or maybe it makes it worse.

mxmlnkn avatar Jul 07 '22 15:07 mxmlnkn

Maybe you could also post a minimal example so I can reproduce it easily for testing purposes?

mxmlnkn avatar Jul 07 '22 15:07 mxmlnkn

> Maybe you could also post a minimal example so I can reproduce it easily for testing purposes?

yep, I was thinking something similar - I'll follow up on this tomorrow

it may even isolate the problem and make it obvious - but I'm out of tricks atm - I think the forked process is somehow having its signals managed in a way that doesn't play well with FUSE, but I haven't managed to debug anything more

phlax avatar Jul 07 '22 15:07 phlax

possibly related, on signals: http://curiousthing.org/sigttin-sigttou-deep-dive-linux

I've seen a few mentions of Mac and Linux backgrounding issues involving SIGTTOU

it's not clear whether this is the issue or how to resolve it, but it seems related

working on a minimal repro I realized a few things 8/

firstly, it is exiting the process (as if by sys.exit) on success afaict - when I thought my script was succeeding, it was just exiting after mounting the directory

not sure exactly how this is being triggered - I tried both catching SystemExit and adding atexit.register(fun), but neither catches the exit

the only time this doesn't happen is if there is a failure, from which I'm deducing, rightly or wrongly, that something low-level in FUSE is bypassing the normal Python lifecycle
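
One mechanism that would produce exactly this observation is a process leaving through a low-level `_exit` call instead of a normal interpreter shutdown. The sketch below only illustrates that assumption (nothing in this thread confirms which call is made on the FUSE side): a child script registers an `atexit` hook, wraps its exit in `try/except SystemExit`, and then calls `os._exit(0)`. Neither hook produces any output.

```python
import subprocess
import sys

# Child script: both hooks would print if the interpreter shut down normally.
CHILD = r"""
import atexit, os, sys
atexit.register(lambda: print("atexit hook ran"))
try:
    print("before exit")
    sys.stdout.flush()   # os._exit does not flush buffered output
    os._exit(0)          # leaves immediately: no SystemExit, no atexit
except SystemExit:
    print("caught SystemExit")
"""


def run_child():
    """Run the child script and return the lines it printed."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    print(run_child())
```

Only the line printed before the exit shows up, which matches the symptom of a script that "just exits" without triggering either hook.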

anyhow, exit issues aside - the following code mounts in the background:


import pathlib
import tarfile
import tempfile
import ratarmount


def create_tarball(output):
    with tempfile.TemporaryDirectory() as tmp:
        pathlib.Path(tmp).joinpath("foo.txt").write_text("BAR")
        with tarfile.open(output, "w:gz") as tar:
            tar.add(tmp, arcname=".")
    return output


def mount_dir(tarball):
    ratarmount.cli((tarball, "tmpmount"))


tarball = create_tarball("baz.tar.gz")
mount_dir(tarball)

but the asyncio/executor version sits in the foreground without exiting:


import asyncio
import concurrent.futures


async def mount_dir_async(tarball):
    asyncio.get_event_loop().run_in_executor(
        concurrent.futures.ProcessPoolExecutor(),
        mount_dir,
        tarball)


asyncio.run(mount_dir_async(tarball))

phlax avatar Jul 08 '22 09:07 phlax

Btw, I'm not sure if it applies to your use case but you could also avoid forking into the background by specifying --foreground and then keep that process or thread in the background yourself. That way it would also be correctly closed on exit.

You could also use ratarmount as a library but that might require more adaption in your code.
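
The `--foreground` suggestion could look roughly like the sketch below. It is an illustration, not ratarmount's API: `mounted`, `mount_command`, and `unmount_command` are hypothetical helper names, and it assumes that the `ratarmount` executable is on `PATH` and that `os.path.ismount` reports the FUSE mount once it is ready. Only the `-f` and `-u` flags mentioned in this thread are used.

```python
import contextlib
import os
import subprocess
import time


def mount_command(tarball, mountpoint):
    # -f keeps ratarmount in the foreground, so this script owns the process.
    return ["ratarmount", "-f", str(tarball), str(mountpoint)]


def unmount_command(mountpoint):
    return ["ratarmount", "-u", str(mountpoint)]


@contextlib.contextmanager
def mounted(tarball, mountpoint, timeout=30.0):
    """Mount for the duration of a with-block (hypothetical helper)."""
    process = subprocess.Popen(mount_command(tarball, mountpoint))
    try:
        deadline = time.monotonic() + timeout
        # Poll for the mount instead of sleeping for a fixed time.
        while not os.path.ismount(mountpoint):
            if process.poll() is not None:
                raise RuntimeError("ratarmount exited before the mount appeared")
            if time.monotonic() > deadline:
                raise TimeoutError("mount point did not appear in time")
            time.sleep(0.1)
        yield mountpoint
    finally:
        # Unmount only if the foreground process is still running.
        if process.poll() is None:
            subprocess.run(unmount_command(mountpoint), check=False)
            try:
                process.wait(timeout=timeout)
            except subprocess.TimeoutExpired:
                process.kill()
                process.wait()
```

Usage would then be `with mounted("baz.tar.gz", "tmpmount") as path: print(os.listdir(path))`, with the unmount happening when the block is left, including on exceptions.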

mxmlnkn avatar Jul 08 '22 15:07 mxmlnkn

> I'm wondering if there is some TTY detection or similar going on in the FUSE lib, and whether there is a way to force return/backgrounding.

Note that your example code goes into background when using ThreadPoolExecutor instead of ProcessPoolExecutor.

The magic for daemonizing happens in this fork in libfuse. Afaik, it starts a daemonized child process and then the actual process simply finishes and quits. However, this does not explain why it does not work in the ProcessPoolExecutor...
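
The general pattern described here (fork, parent quits, child keeps running) can be sketched in plain Python. This is a generic illustration on a POSIX system, not libfuse's actual code; the launcher script and the marker file are stand-ins made up for the demo.

```python
import os
import subprocess
import sys
import tempfile
import time

# Launcher script: fork, let the parent leave at once, let the child detach.
LAUNCHER = r"""
import os, sys, time
marker = sys.argv[1]
if os.fork() > 0:
    os._exit(0)        # parent: hand control back to the caller right away
os.setsid()            # child: new session, detached from the terminal
time.sleep(0.2)        # stand-in for the long-running file system loop
with open(marker, "w") as handle:
    handle.write("written by the detached child\n")
"""


def run_demo(timeout=10.0):
    """Return (launcher exit code, whether the detached child did its work)."""
    with tempfile.TemporaryDirectory() as tmp:
        marker = os.path.join(tmp, "marker.txt")
        launcher = subprocess.run(
            [sys.executable, "-c", LAUNCHER, marker],
            stdin=subprocess.DEVNULL,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        # The launcher has already exited; wait for the child's marker file.
        deadline = time.monotonic() + timeout
        while not os.path.exists(marker) and time.monotonic() < deadline:
            time.sleep(0.05)
        return launcher.returncode, os.path.exists(marker)


if __name__ == "__main__":
    print(run_demo())
```

The standard streams are redirected to `/dev/null` on purpose: a detached child that inherits a pipe keeps it open, so a caller waiting for end-of-file on that pipe would keep waiting. Whether something similar is behind the `ProcessPoolExecutor` behaviour is not established in this thread.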

I'm not sure what exactly your intentions were but if you want to mount it and use the mount point in the same script, then the previously recommended library would be the way to go.

If you want to use FUSE because another function or library wants an existing file system path, then I don't see the problem with the current behavior. The only missing link would be that you call ratarmount -u before exiting so that your Python program does not hang:

import asyncio
import concurrent.futures
import os
import time

import ratarmount

# `tarball` is the archive created by create_tarball() in the earlier example.

def mount_dir(tarball):
    ratarmount.cli(("-f", tarball, "tmpmount"))

async def mount_dir_async(tarball):
    asyncio.get_event_loop().run_in_executor(
        concurrent.futures.ProcessPoolExecutor(),
        mount_dir,
        tarball)

async def do_something_with_the_mount_async():
    # We need to wait at least until the mount point got created and initialized
    # There might be a better way to do this...
    time.sleep(2)
    print("Contents of mount point:", os.listdir("tmpmount"))

async def runAsync():
    mountTask = asyncio.create_task(mount_dir_async(tarball))
    doTask = asyncio.create_task(do_something_with_the_mount_async())

    await doTask
    print("Unmounting now ...")
    ratarmount.cli(("-u", "tmpmount"))
    await mountTask

    print("Finished")

asyncio.run(runAsync())

Output:

fusermount: entry for tmpmount not found in /etc/mtab
Creating new SQLite index database at baz.tar.gz.index.sqlite
Creating offset dictionary for baz.tar.gz ...
Creating offset dictionary for baz.tar.gz took 0.00s
Writing out TAR index to baz.tar.gz.index.sqlite took 0s and is sized 24576 B
Contents of mount point: ['foo.txt']
Unmounting now ...
Finished

This does work somewhat even though it is clunky. I'm no expert on async usage in Python... That's why at times I get a weird error when it tries to exit this script:

exception calling callback for <Future at 0x7fc0eb816410 state=finished returned NoneType>
Traceback (most recent call last):
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 342, in _invoke_callbacks
    callback(self)
  File "/usr/lib/python3.10/asyncio/futures.py", line 399, in _call_set_state
    dest_loop.call_soon_threadsafe(_set_state, destination, source)
  File "/usr/lib/python3.10/asyncio/base_events.py", line 795, in call_soon_threadsafe
    self._check_closed()
  File "/usr/lib/python3.10/asyncio/base_events.py", line 515, in _check_closed
    raise RuntimeError('Event loop is closed')
RuntimeError: Event loop is closed
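
The traceback is consistent with the future returned by `run_in_executor` never being awaited: `mount_dir_async` returns immediately, and if the executor job finishes only after `asyncio.run` has closed the loop, the completion callback has no open loop to report to. A minimal sketch of the awaited form follows, with a hypothetical `blocking_job` and a `ThreadPoolExecutor` as stand-ins so that it runs without ratarmount:

```python
import asyncio
import concurrent.futures
import time


def blocking_job(name):
    # Hypothetical stand-in for a blocking call such as mounting.
    time.sleep(0.1)
    return f"{name} done"


async def run_job(name):
    loop = asyncio.get_running_loop()
    with concurrent.futures.ThreadPoolExecutor() as pool:
        # Awaiting keeps the coroutine alive until the job has finished,
        # so the result arrives while the event loop is still open.
        return await loop.run_in_executor(pool, blocking_job, name)


if __name__ == "__main__":
    print(asyncio.run(run_job("mount")))
```

With this form, awaiting the task that wraps the coroutine really does wait for the executor job, so the unmount and the final await happen in a well-defined order.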

If you want your own program to go into the background, then you still shouldn't depend on ratarmount doing that, because it probably is not possible in the first place. You could, however, just like in the above example, start your own daemonized subprocess, which then starts ratarmount normally, without daemonizing, in a sub-subprocess. Something like:

process = multiprocessing.Process(target=lambda: asyncio.run(runAsync()), daemon=True)
process.start()

I can't get it to work correctly right now.

Feel free to reopen this or another issue if you still have trouble but I don't see a fix in ratarmount for the daemonizing part and the "hang" should be fixed by calling fusermount -u or ratarmount -u.

mxmlnkn avatar Feb 20 '23 00:02 mxmlnkn