How should we use pyjulia in parallel?
I want to propose a discussion about how pyjulia should be used in parallel computing.
I recently found a way that seems rather stable:
- we should make sure that the Julia code is loaded only once per process
- we should use "w+" mode for memmapping
Here is a mini snippet with a decorator that could be added to pyjulia to ensure point 1.
For point 2, joblib already has an option in Parallel (mmap_mode). Should we make this explicit in the documentation?
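from joblib import Parallel, delayed
from tqdm import tqdm


def julia_import(module: str, filename: str):
    """Include `filename` in Julia's Main once per process, then call the function."""
    def decorator(fn):
        def wrapper(*args, **kwargs):
            # Import inside the wrapper so Julia is initialized in the worker process
            from julia import Main
            # Include the file only if the module is not defined in this process yet
            if not hasattr(Main, module):
                Main.include(filename)
            return fn(Main, *args, **kwargs)
        return wrapper
    return decorator


@julia_import("MyModule", "lib.jl")
def python_function_using_julia(Main, *args, **kwargs):
    return Main.MyModule.julia_function(*args, **kwargs)


# `data` is an iterable of arrays defined elsewhere
res = Parallel(n_jobs=-1, mmap_mode="w+")(
    delayed(python_function_using_julia)(arr) for arr in tqdm(data))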
Example from: https://github.com/00sapo/pyjulia-vs-juliacall/blob/master/test.py
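To check what the decorator guarantees without starting Julia, here is a minimal sketch that swaps `julia.Main` for a hypothetical stand-in object (`FakeMain`, which is not part of pyjulia) whose `include` only records calls. The guard is the same as in the snippet above, except that the namespace is passed in explicitly (`julia_import_with` is a name made up for this example) so that it runs without a Julia installation.

```python
import functools


class FakeMain:
    """Hypothetical stand-in for `julia.Main`; not part of pyjulia."""

    def __init__(self):
        self.include_calls = []

    def include(self, filename):
        # The real Main.include would evaluate the Julia file; here we only
        # record the call and pretend the file defined `MyModule`.
        self.include_calls.append(filename)
        self.MyModule = object()


def julia_import_with(main, module, filename):
    """Same guard as `julia_import`, with the namespace passed explicitly."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            # Include the file only if this process has not defined the module yet.
            if not hasattr(main, module):
                main.include(filename)
            return fn(main, *args, **kwargs)
        return wrapper
    return decorator


fake_main = FakeMain()


@julia_import_with(fake_main, "MyModule", "lib.jl")
def work(main, x):
    return x * 2


results = [work(i) for i in range(5)]
print(results)                  # [0, 2, 4, 6, 8]
print(fake_main.include_calls)  # ['lib.jl']
```

Five calls trigger a single `include`. With real workers the same check runs separately in each process, so each worker would include the file at most once, on its first task, provided the workers are reused across tasks.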
@00sapo Would this code work within a Python sub-process?
Are you talking about the subprocess module, or do you want to run parallel code inside code that is already parallel? In the first case, I don't see why you would need it. In the second case, avoid it as much as possible!
@00sapo I have a program that is spawned from the main process using the subprocess module. The program (i.e. each subprocess) calls PyJulia and diffeqpy. However, I run into ReadOnlyMemoryError() errors when the number of subprocesses is greater than 50 (with fewer than that it works fine). My question is whether your script could help with the stable creation of a large number of Julia instances, one per subprocess.
Have you tried setting mmap_mode="w+"?
@00sapo I am not using joblib's Parallel to spawn the Julia processes, so I am not sure how I could set mmap_mode="w+".
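For context on what the mode string controls: "w+" asks for a writable memory map, whereas a read-only map rejects writes. The sketch below shows that difference using only the standard-library `mmap` module. It involves neither joblib, NumPy, nor Julia, and whether read-only buffers are what triggers the ReadOnlyMemoryError() above is an assumption, not something confirmed in this thread.

```python
import mmap
import os
import tempfile

# Create a small scratch file to map (the path is generated, not from the thread).
fd, path = tempfile.mkstemp()
os.write(fd, b"\x00" * 16)
os.close(fd)

try:
    # Read-only mapping: reads work, writes are rejected.
    with open(path, "rb") as f:
        ro = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        try:
            ro[0] = 1
            readonly_write_ok = True
        except TypeError:
            readonly_write_ok = False
        ro.close()

    # Writable mapping: the same assignment succeeds and reaches the file.
    with open(path, "r+b") as f:
        rw = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_WRITE)
        rw[0] = 1
        rw.flush()
        rw.close()

    with open(path, "rb") as f:
        first_byte = f.read(1)
finally:
    os.remove(path)

print(readonly_write_ok, first_byte)
```

If the arrays handed to Julia in each subprocess come from a memory-mapped file, opening that file writable (for example with `numpy.memmap` in mode "r+", or "w+" when creating a new file) would be the closest analogue of joblib's mmap_mode="w+". Note that "w+" creates or overwrites the file, so it is not suitable for data that already exists on disk.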