Issue with `JoblibParallelization`

Open cheginit opened this issue 1 year ago • 2 comments

The current implementation of `JoblibParallelization`, which takes an already instantiated `Parallel` object, can lead to issues such as this one. A better approach would be to accept all of the `Parallel` arguments as input and instantiate `Parallel` inside the `__call__` method. I tested this approach, and it works without any issues. I can open a PR if you are interested.

cheginit avatar Jan 01 '24 18:01 cheginit

Thanks for your feedback! Can you provide a small example here in this issue of what the interface would look like? A PR works too, of course.

blankjul avatar Jan 10 '24 03:01 blankjul

Sure. This is what I've been using:

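from __future__ import annotations  # allows `X | Y` annotations on older Python versions

from collections.abc import Callable, Generator, Iterable
from pathlib import Path
from typing import Any, Literal

import joblib


class JoblibParallelization: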
    def __init__(
        self,
        n_jobs: int = -1,
        backend: Literal["loky", "threading", "multiprocessing"] = "loky",
        return_as: Literal["list", "generator"] = "list",
        verbose: int = 0,
        timeout: float | None = None,
        pre_dispatch: str | int = "2 * n_jobs",
        batch_size: int | Literal["auto"] = "auto",
        temp_folder: str | Path | None = None,
        max_nbytes: int | str | None = "1M",
        mmap_mode: Literal["r+", "r", "w+", "c"] | None = "r",
        prefer: Literal["processes", "threads"] | None = None,
        require: Literal["sharedmem"] | None = None,
        # Extra arguments are accepted but not forwarded to joblib.Parallel.
        *args: Any,
        **kwargs: Any,
    ) -> None:
        self.n_jobs = n_jobs
        self.backend = backend
        self.return_as = return_as
        self.verbose = verbose
        self.timeout = timeout
        self.pre_dispatch = pre_dispatch
        self.batch_size = batch_size
        self.temp_folder = temp_folder
        self.max_nbytes = max_nbytes
        self.mmap_mode = mmap_mode
        self.prefer = prefer
        self.require = require
        super().__init__()

    def __call__(
        self,
        f: Callable[..., Any],
        X: Iterable[Any],
    ) -> list[Any] | Generator[Any, Any, None]:
        with joblib.Parallel(
            n_jobs=self.n_jobs,
            backend=self.backend,
            return_as=self.return_as,
            verbose=self.verbose,
            timeout=self.timeout,
            pre_dispatch=self.pre_dispatch,
            batch_size=self.batch_size,
            temp_folder=self.temp_folder,
            max_nbytes=self.max_nbytes,
            mmap_mode=self.mmap_mode,
            prefer=self.prefer,
            require=self.require,
        ) as parallel:
            return parallel(joblib.delayed(f)(x) for x in X)

It can be easily instantiated without any arguments.

cheginit avatar Jan 10 '24 15:01 cheginit
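
For readers who want to try the idea quickly, here is a minimal sketch of the same pattern: store the `Parallel` options at construction time and build the `joblib.Parallel` object only inside `__call__`. The class name `LazyJoblibParallelization`, the `parallel_kwargs` attribute, and the `square` helper are hypothetical names used for illustration only; they are not part of pymoo or of the snippet above, and this is not necessarily how a final PR would look.

```python
from __future__ import annotations

from collections.abc import Callable, Iterable
from typing import Any

import joblib


class LazyJoblibParallelization:
    """Hypothetical compact variant: keeps only plain options, no live Parallel object."""

    def __init__(self, n_jobs: int = -1, **parallel_kwargs: Any) -> None:
        # Store the options as a plain dict; nothing is started here.
        self.parallel_kwargs = {"n_jobs": n_jobs, **parallel_kwargs}

    def __call__(self, f: Callable[..., Any], X: Iterable[Any]) -> Any:
        # A fresh Parallel is created for each call and cleaned up on exit.
        with joblib.Parallel(**self.parallel_kwargs) as parallel:
            return parallel(joblib.delayed(f)(x) for x in X)


def square(x: int) -> int:
    return x * x


# The threading backend is chosen here only to keep the example lightweight.
runner = LazyJoblibParallelization(n_jobs=2, backend="threading")
results = runner(square, range(5))
print(results)  # [0, 1, 4, 9, 16]
```

Because the instance holds only configuration values, the same `runner` can be reused across calls, and each call gets its own short-lived `Parallel` instance.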