Process hangs after app spawn

Open polakovicp opened this issue 5 years ago • 10 comments

Hello, I am experiencing a problem spawning an Android app with Frida 11.0.0. I have multiple Android VMs managed from one Python process using asyncio. The following coroutine code spawns an application inside a VM:

            for device in device_manager.enumerate_devices():
                if device.id == ip_port:
                    break
            else:
                device = frida.get_usb_device(timeout=5)
            pid = device.spawn([app_name]) # <= sometimes never returns

The problem is that device.spawn(...) sometimes does not return. Since the call is inside a coroutine, every other task in the event loop is left waiting. Could this be some kind of deadlock, given that I have multiple devices inside one process? Is it normal for spawn() to get stuck? Thanks
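One way to keep the other tasks running while spawn() blocks is to move the call to a worker thread and bound the wait. The following is a minimal sketch under stated assumptions, not a fix for the hang itself: spawn_with_timeout and FakeDevice are hypothetical names introduced here, and FakeDevice only stands in for a real frida device so the example runs without frida installed.

```python
import asyncio
import time


async def spawn_with_timeout(device, argv, timeout=10.0):
    """Run the blocking device.spawn() in a worker thread so the event loop stays responsive."""
    loop = asyncio.get_running_loop()
    future = loop.run_in_executor(None, device.spawn, argv)
    # wait_for() only stops *waiting*; a truly hung spawn() still occupies its worker thread.
    return await asyncio.wait_for(future, timeout)


class FakeDevice:
    """Hypothetical stand-in for a frida device, used only to make this sketch runnable."""

    def __init__(self, delay):
        self.delay = delay

    def spawn(self, argv):
        time.sleep(self.delay)  # simulates a slow (or stuck) spawn
        return 1234


async def demo():
    # Fast spawn: returns the pid well within the timeout.
    pid = await spawn_with_timeout(FakeDevice(0.01), ["com.example.app"], timeout=1.0)
    # Slow spawn: the coroutine gives up after 50 ms instead of blocking the loop.
    try:
        await spawn_with_timeout(FakeDevice(0.5), ["com.example.app"], timeout=0.05)
        timed_out = False
    except asyncio.TimeoutError:
        timed_out = True
    return pid, timed_out


print(asyncio.run(demo()))
```

Note that the timeout only frees the coroutine. The thread running the stuck call cannot be cancelled, so repeated hangs would eventually exhaust the default executor's threads.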

polakovicp avatar Mar 07 '19 19:03 polakovicp

I am having the same issue when spawning multiple times in a loop. At some point (after around 1000 successful spawns, each of which exited before the next iteration), the spawn() function starts hanging.

nksCH avatar Jun 15 '19 12:06 nksCH

@nksCH Which platform and what are you spawning?

oleavr avatar Jun 15 '19 12:06 oleavr

I had the same problem when I started development of passionfruit, then I just gave up and moved to Node.js...

ChiChou avatar Jun 16 '19 13:06 ChiChou

Sorry, my bug report was suboptimal.

I am running Frida on a 64-bit Kali Linux instance (so the platform in this case is not Android). My setup works as follows:

  • I have a main loop and 4 workers (each instantiates Frida on its own)
  • After around 1000 spawns (the number varies), every new spawn inside my workers raises a "timeout was reached" exception
  • After killing the main thread and re-launching the whole process, it works again.

I see that, for example, frida.get_local_device() has a timeout set, but spawn() does not support timeouts. I tried to identify other sources of failure, such as exhausted file or process handles, but this is not the case.

Additionally, I created a trace (using Python's trace module [-m trace --trace]):

 --- modulename: __init__, funcname: <lambda>
__init__.py(75):     return get_device_matching(lambda device: device.type == 'local', timeout=0)
__init__.py(104):         initial_matches = [device for device in manager.enumerate_devices() if predicate(device)]
__init__.py(105):         if len(initial_matches) > 0:
__init__.py(106):             return initial_matches[0]
__init__.py(119):         manager.off('added', on_device_added)
 --- modulename: core, funcname: off
core.py(43):         self._impl.off(signal, callback)
 --- modulename: core, funcname: spawn
core.py(90):         if not isinstance(program, string_types):
core.py(91):             argv = program
core.py(92):             program = argv[0]
core.py(93):             if len(argv) == 1:
core.py(96):         aux_options = kwargs
core.py(98):         return self._impl.spawn(program, argv, envp, env, cwd, stdio, aux_options)

Hope that helps.
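Since spawn() takes no timeout and re-launching the whole process is reported to clear the problem, one possible workaround is to run each job in a child interpreter that can be killed when it hangs. This is only a sketch: run_job_in_child and the two job strings are hypothetical placeholders, and a real job would import frida and call device.spawn() inside the child.

```python
import subprocess
import sys


def run_job_in_child(script, timeout):
    """Run `script` in a fresh Python interpreter; return its exit code, or None if it hung."""
    try:
        completed = subprocess.run([sys.executable, "-c", script], timeout=timeout)
    except subprocess.TimeoutExpired:
        # subprocess.run() has already killed and reaped the child at this point.
        return None
    return completed.returncode


# Placeholder jobs; a real job would import frida and call device.spawn() here.
QUICK_JOB = "print('spawned')"
HUNG_JOB = "import time; time.sleep(60)"

print(run_job_in_child(QUICK_JOB, timeout=30))
print(run_job_in_child(HUNG_JOB, timeout=1))
```

The design trades startup cost for isolation: whatever state accumulates in a worker after many spawns is discarded along with the child process.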

nksCH avatar Jun 17 '19 07:06 nksCH

I had the same problem when I started development of passionfruit, then I just gave up and moved to Node.js...

I have the same problem... Sometimes it works, sometimes it fails. That's OK for manual work, but I am working on a serious project, like ChiChou was.

vadimszzz avatar Apr 21 '22 12:04 vadimszzz

@vadimszzz Without seeing your code it's impossible to say where the issue is, but in general: watch out for callbacks from Frida. If you do any blocking work in any of them, i.e. you don't immediately let the underlying native thread return to Frida's internal event loop, you will be starving Frida's I/O, and this will result in anything from timeouts (at best) to deadlocks (at worst). This is because our current Python bindings use our synchronous APIs. Once they are migrated to async/await, these kinds of pitfalls will finally become a thing of the past. In the meantime I would second @ChiChou's suggestion of using our Node.js bindings, as those are fully asynchronous (they don't use our synchronous APIs). If you want to help out with migrating frida-python to async/await, though, please get in touch; we could definitely use more hands on deck 🙂
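The advice about callbacks can be illustrated with a hand-off pattern: the callback only enqueues the message and returns, and a separate thread does the slow work. This is a minimal sketch that assumes the (message, data) callback shape used with script.on('message', ...). The queue, the worker, and the simulated calls are illustrative and run without frida.

```python
import queue
import threading

messages = queue.Queue()
processed = []


def on_message(message, data):
    """Callback in the (message, data) shape: enqueue and return immediately."""
    messages.put((message, data))


def worker():
    """Do the slow work on our own thread, never on the thread that invoked the callback."""
    while True:
        item = messages.get()
        if item is None:  # sentinel: stop the worker
            break
        message, data = item
        processed.append(message["payload"])  # placeholder for slow handling


thread = threading.Thread(target=worker, daemon=True)
thread.start()

# Simulated callbacks; with frida these would arrive from one of its own threads.
for i in range(3):
    on_message({"type": "send", "payload": i}, None)

messages.put(None)
thread.join()
print(processed)
```

In an asyncio program the same idea applies, with the callback handing the message to the event loop via loop.call_soon_threadsafe() instead of a queue.Queue.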

oleavr avatar Apr 21 '22 16:04 oleavr

(That said, if you do your threading correctly you will be fine with the current Python bindings too; they just make it very easy to shoot yourself in the foot without realizing.)

oleavr avatar Apr 21 '22 16:04 oleavr

@nksCH Thanks! It sounds like the issue may be in the Linux backend, perhaps some issue with how we handle ptrace() and waitpid(). The good news is that this code is vastly simpler than the iOS one that @vadimszzz is likely referring to. (Assuming your callbacks from Frida are well-behaved and return right away.) Does it matter what the target program is or can you reproduce this with /bin/cat or some simple program?
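A reproduction along these lines could be as small as a loop that spawns and kills a trivial target and reports how far it got. The sketch below is an assumption about what such a harness might look like: count_spawns_until_failure and FlakyDevice are hypothetical names, and FlakyDevice is a stub that fails on a chosen iteration so the example runs without frida. It detects errors such as the reported timeout, not a spawn() that never returns.

```python
def count_spawns_until_failure(device, argv, iterations):
    """Spawn and kill `argv` repeatedly; return how many spawns succeeded before the first error."""
    for completed in range(iterations):
        try:
            pid = device.spawn(argv)
        except Exception as error:  # e.g. the "timeout was reached" error reported above
            print(f"spawn #{completed + 1} failed: {error!r}")
            return completed
        device.kill(pid)  # make sure the target is gone before the next iteration
    return iterations


class FlakyDevice:
    """Hypothetical stub that fails from the Nth spawn onward, so the sketch runs without frida."""

    def __init__(self, fail_at):
        self.fail_at = fail_at
        self.calls = 0

    def spawn(self, argv):
        self.calls += 1
        if self.calls >= self.fail_at:
            raise TimeoutError("timeout was reached")
        return self.calls  # fake pid

    def kill(self, pid):
        pass


print(count_spawns_until_failure(FlakyDevice(fail_at=4), ["/bin/cat"], 10))
```

With frida installed, the stub would be replaced by a real device, for example the one returned by frida.get_local_device(), and the iteration count raised well past 1000.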

oleavr avatar Apr 21 '22 16:04 oleavr

@oleavr No, I have exactly the same issue:

frida_device.spawn("Application")  # <- sometimes gets stuck here and never returns

I also get a lot of frida.TimedOutError exceptions in other places.

vadimszzz avatar Apr 26 '22 14:04 vadimszzz

bump

neomafo88 avatar Apr 12 '23 08:04 neomafo88