
Feature request: support for multiprocessing.Pool "initializer".

Open gwerbin opened this issue 1 year ago • 5 comments

The multiprocessing.Pool interface accepts a custom "initializer" function that each worker process runs once when it starts: https://docs.python.org/3/library/multiprocessing.html#multiprocessing.pool.Pool

This is useful for things like suppressing specific warnings, setting up logging, and other "scaffolding" that is occasionally useful (or even required) in applications.
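For readers unfamiliar with the mechanism, here is a minimal sketch of the standard-library behaviour being referred to. It uses plain `multiprocessing.Pool`, not pandarallel; the names `init_worker`, `noisy_square`, and `run`, and the warning text, are made up for illustration.

```python
import logging
import multiprocessing
import warnings


def init_worker(log_level: int, ignored_message: str) -> None:
    """Runs once in each worker process, before it receives any tasks."""
    logging.basicConfig(level=log_level)
    # `message` is a regex matched against the start of the warning text.
    warnings.filterwarnings(
        "ignore", message=ignored_message, category=RuntimeWarning
    )


def noisy_square(x: int) -> int:
    """Emits a warning that the initializer's filter suppresses, then squares x."""
    warnings.warn("invalid value encountered in demo", RuntimeWarning)
    return x * x


def run(values):
    """Maps noisy_square over values in a pool whose workers ran init_worker."""
    with multiprocessing.Pool(
        processes=2,
        initializer=init_worker,
        initargs=(logging.INFO, "invalid value encountered"),
    ) as pool:
        return pool.map(noisy_square, values)


if __name__ == "__main__":
    print(run([1, 2, 3]))
```

Because the filter is installed inside each worker, it applies to code running there regardless of what the parent process has configured.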

It seems like Pandarallel does not use its own initializer function, so it would be nice if users could provide their own.

Looking over the code, this seems like a relatively unintrusive backward-compatible change that most users won't notice at all, but would benefit the small number of users who do want or need this feature.
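To illustrate why this could be backward compatible, here is a rough sketch. It is not pandarallel's actual code: `pool_kwargs` is a hypothetical helper, and the only thing it relies on is that `multiprocessing.Pool` itself defaults to `initializer=None` and `initargs=()`.

```python
from typing import Any, Callable, Dict, Optional, Tuple


def pool_kwargs(
    nb_workers: int,
    initializer: Optional[Callable[..., None]] = None,
    initargs: Tuple[Any, ...] = (),
) -> Dict[str, Any]:
    """Hypothetical helper: builds the keyword arguments for multiprocessing.Pool.

    The defaults match Pool's own defaults, so callers that never pass
    `initializer` or `initargs` would get the same pool as before.
    """
    return {
        "processes": nb_workers,
        "initializer": initializer,
        "initargs": tuple(initargs),
    }


if __name__ == "__main__":
    # Existing callers: no initializer is forwarded.
    print(pool_kwargs(4))
    # New callers: the initializer and its arguments are passed straight through.
    print(pool_kwargs(4, initializer=print, initargs=("worker started",)))
```

The resulting dictionary could then be unpacked into the pool constructor, for example `multiprocessing.Pool(**pool_kwargs(4))`.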

Hypothetical usage:

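import warnings

from pandarallel import pandarallel


def _suppress_shapely_warning(ignore: bool = True) -> None:
    # Runs once in each worker process, before any work is dispatched to it.
    warnings.filterwarnings(
        'ignore' if ignore else 'default',
        message='invalid value encountered in intersects',
        category=RuntimeWarning,
        module=r'shapely\.predicates',
        lineno=758,
        append=True,
    )


# `initializer` and `initargs` are the proposed new parameters,
# mirroring the multiprocessing.Pool arguments of the same names.
pandarallel.initialize(
    nb_workers=10,
    progress_bar=False,
    initializer=_suppress_shapely_warning,
    initargs=(True,),
)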

gwerbin avatar Apr 14 '23 14:04 gwerbin

Hi @gwerbin,

As you may have noticed, both @nalepae and I are currently super busy. I'm trying to keep this package running as best as I can, but I don't really have the time to expand it. However, I very much see how this could be useful, and if you're willing to draft a PR I would gladly review and merge it. From scanning the code I also suspect it wouldn't be a big change.

till-m avatar Apr 14 '23 14:04 till-m

Yep, if you can provide a PR (with tests and docs) it could be super nice!

nalepae avatar Apr 14 '23 15:04 nalepae

Thanks @till-m and @nalepae! Happy to make a PR. I'll be busy for the next week followed by a short vacation, so I'll set a reminder for myself to look at it in early May.

gwerbin avatar Apr 18 '23 17:04 gwerbin

I realized this was a very small change, so I went ahead and created https://github.com/nalepae/pandarallel/pull/232

gwerbin avatar Apr 18 '23 17:04 gwerbin

Pandaral·lel is looking for a maintainer! If you are interested, please open a GitHub issue.

nalepae avatar Jan 23 '24 09:01 nalepae