prophet
Unable to run cross-validation in parallel mode "processes"
Hello, I'm using Prophet v1.0 with Anaconda3 2020.11 on Windows 10 64-bit. I'm trying to run cross-validation in parallel mode "processes" using the example provided in the documentation, but I always get this error message (the error.log is very long, so I attached it instead of pasting it here). The code I used:
```python
import pandas as pd
import itertools
import numpy as np
from prophet import Prophet
from prophet.diagnostics import cross_validation
from prophet.diagnostics import performance_metrics

df = pd.read_csv("example_wp_log_peyton_manning.csv")

param_grid = {
    "changepoint_prior_scale": [0.001, 0.01, 0.1, 0.5],
    "seasonality_prior_scale": [0.01, 0.1, 1.0, 10.0],
}

# Generate all combinations of parameters
all_params = [dict(zip(param_grid.keys(), v)) for v in itertools.product(*param_grid.values())]
rmses = []  # Store the RMSEs for each params here

# Use cross validation to evaluate all parameters
for params in all_params:
    m = Prophet(**params).fit(df)  # Fit model with given params
    df_cv = cross_validation(m, horizon="30 days", parallel="processes")
    df_p = performance_metrics(df_cv, rolling_window=1)
    rmses.append(df_p["rmse"].values[0])

# Find the best parameters
tuning_results = pd.DataFrame(all_params)
tuning_results["rmse"] = rmses
print(tuning_results)
```
If I run the code on Google Colab then everything works fine.
Can anyone help, please? Thank you.
Does it work if you run it with `parallel` set to `None` or to `'threads'`? It's really hard to debug issues with parallel processing. The key part of the error message seems to be:
```
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.

This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:

    if __name__ == '__main__':
        freeze_support()
        ...
```
but I'm not quite sure what is to be made of that.
As a side note, I did notice the line `INFO:prophet:Making 172 forecasts with cutoffs between 2008-12-12 00:00:00 and 2015-12-21 00:00:00`, which is a really large number of forecasts for cross validation and will probably be super slow. I'd recommend increasing the `initial` and/or `period` inputs to `cross_validation` to get that down to something that won't take so long to run.
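For reference, the 172 in that log line follows from the default cutoff spacing: with `horizon="30 days"`, `period` defaults to half the horizon, i.e. 15 days, so stretching `period` directly shrinks the count. A quick back-of-the-envelope check using the cutoff dates from the log line:

```python
from datetime import date

# First and last cutoff dates reported in the INFO line above.
first = date(2008, 12, 12)
last = date(2015, 12, 21)
span = (last - first).days  # 2565 days between first and last cutoff

# Default period for horizon="30 days" is 15 days (half the horizon):
print(span // 15 + 1)   # 172 cutoffs, matching the log

# Spacing cutoffs 180 days apart over a comparable span:
print(span // 180 + 1)  # 15 cutoffs
```

The corresponding call would be along the lines of `cross_validation(m, initial="1095 days", period="180 days", horizon="30 days")`; those specific values are only illustrations, not recommendations.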
Yes, the code works if I run it with `parallel` set to `None` or to `threads`. I also tried the code on another computer of mine with the same environment, but the result is still the same. If I try it on Google Colab then the code runs just fine.
Also thank you for the suggestion on the forecasting parameters. I will try to run the code without Anaconda to see if the error still happens.
I guess the workaround is probably to use `threads`. I haven't run into this myself on Linux, and I probably won't be able to debug the issue.
According to #1434, running cross-validation with `parallel` set to `threads` is much slower than setting it to `processes`. I'm unable to build `prophet` on Windows, so another workaround of mine is to use WSL. In my case there is a huge difference in terms of execution time:
- Cross-validation with `parallel` set to `processes` in WSL (Debian, without Anaconda): 0:02:44.19 (100% CPU usage)
- Cross-validation with `parallel` set to `threads` in Windows (with Anaconda): 0:08:40.13 (only around 50% CPU usage)
I hope that in the future someone will be able to help debug this issue.
@nviet I just saw #1889, which seems like it might be related. A solution presented there was to do

```python
import multiprocessing
multiprocessing.set_start_method("fork")
```

prior to importing prophet. Could you see if that works here?
Thanks for your suggestion. Unfortunately the `fork` start method is available on Unix only, where it is also the default. On Windows the only available method is `spawn`. This is stated in Python's official documentation too. Trying to set the start method to `fork` on Windows results in this error message:
```
Traceback (most recent call last):
  File "test.py", line 3, in <module>
    multiprocessing.set_start_method("fork")
  File "d:\Programs\anaconda3\envs\myenv\lib\multiprocessing\context.py", line 246, in set_start_method
    self._actual_context = self.get_context(method)
  File "d:\Programs\anaconda3\envs\myenv\lib\multiprocessing\context.py", line 238, in get_context
    return super().get_context(method)
  File "d:\Programs\anaconda3\envs\myenv\lib\multiprocessing\context.py", line 192, in get_context
    raise ValueError('cannot find context for %r' % method) from None
ValueError: cannot find context for 'fork'
```
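To see which start methods a given platform actually supports before calling `set_start_method`, the standard library exposes `multiprocessing.get_all_start_methods()`. A small sketch (the exact list varies by platform and Python version):

```python
import multiprocessing

# On Windows this prints ['spawn']; on Linux it is typically
# ['fork', 'spawn', 'forkserver'].
print(multiprocessing.get_all_start_methods())

# Guarding on availability avoids the ValueError above. get_context()
# returns a per-call context without changing the global start method.
if "fork" in multiprocessing.get_all_start_methods():
    ctx = multiprocessing.get_context("fork")
```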
The issue lies in the line `pool = concurrent.futures.ProcessPoolExecutor()` in the file `diagnostics.py`. As shared in an answer to a StackOverflow question on parallelism on Windows:
> Multiprocessing works differently on ms-windows because that OS lacks the `fork` system call used on UNIX and macOS. `fork` creates the child process as a perfect copy of the parent process. All the code and data in both processes are the same. The only difference being the return value of the `fork` call. (That is to let the new process know it is a copy.) So the child process has access to (a copy of) all the data in the parent process.
>
> On ms-windows, multiprocessing tries to "fake" fork by launching a new python interpreter and have it import your main module. This means (among other things) that your main module has to be importable without side effects such as starting another process. Hence the reason for `if __name__ == '__main__'`. It also means that your worker processes might or might not have access to data created in the parent process, depending on where it is created. It will have access to anything created before `__main__`. But it would not have access to anything created inside the main block.
Facing the same issue on Mac.
Regarding the 100% vs 50% CPU utilization, could the problem be that Windows reports all logical cores (double the number of physical cores for CPUs with hyperthreading)? In my experience with scientific computing, using e.g. 8 cores on a 4-core machine with hyperthreading yields either no benefit or an outright decrease in speed, compared to using just the 4 cores. Are the training times roughly the same in vanilla Windows as in WSL (WSL2?)?
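For what it's worth, `os.cpu_count()`, which `ProcessPoolExecutor` uses for its default worker count, reports logical (hyperthreaded) cores, so capping `max_workers` near the physical count is a common mitigation. A sketch; the halving assumes two hardware threads per physical core:

```python
import os
import concurrent.futures

logical = os.cpu_count()  # logical cores, e.g. 8 on a 4-core HT machine
print(logical)

# Assume two hardware threads per physical core and cap the pool there.
pool = concurrent.futures.ProcessPoolExecutor(max_workers=max(1, logical // 2))
pool.shutdown()
```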
I had a similar issue when using fbprophet, and my solution was analogous to changing:

```python
for params in all_params:
    m = Prophet(**params).fit(df)  # Fit model with given params
    df_cv = cross_validation(m, horizon="30 days", parallel="processes")
    df_p = performance_metrics(df_cv, rolling_window=1)
    rmses.append(df_p["rmse"].values[0])
```

to

```python
for params in all_params:
    if __name__ == '__main__':
        m = Prophet(**params).fit(df)  # Fit model with given params
        df_cv = cross_validation(m, horizon="30 days", parallel="processes")
        df_p = performance_metrics(df_cv, rolling_window=1)
        rmses.append(df_p["rmse"].values[0])
```
This turned out to be a bug with `processes` mode writing multiple tempfiles to the same directory. This PR fixed the bug: https://github.com/facebook/prophet/pull/2088/files, and Prophet 1.1 has been released.
Please try `pip install --upgrade prophet`.
This issue still persists after upgrading prophet.
Digging into this a bit, here's a reproducible snippet that replicates what Prophet is doing, but without prophet. This errors on Windows:

```python
import concurrent.futures

def times_two(val):
    return val * 2

pool = concurrent.futures.ProcessPoolExecutor()
vals = [1, 2, 3, 4]
res = pool.map(times_two, vals)
```
The error:

```
RuntimeError:
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.
```
The problem is that the `__main__` module gets re-imported by each child, and an infinite loop is created if the user code doesn't have `if __name__ == '__main__':` in it.
If you're not routinely testing the package across platforms, it might be a good idea to use something like `joblib` (as scikit-learn does); I get the feeling that they've been through all these pains and worked things out. Or use `loky` directly.
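As a sketch of what that might look like: `joblib`'s default `loky` backend serializes the callable itself (via cloudpickle) rather than relying on re-importing `__main__`, which is exactly the failure mode here. This assumes `joblib` is installed; the API shown is `joblib.Parallel`/`joblib.delayed`:

```python
from joblib import Parallel, delayed

def times_two(val):
    return val * 2

# loky pickles times_two directly, so no __main__ guard is needed,
# even on Windows or in interactive sessions.
res = Parallel(n_jobs=2)(delayed(times_two)(v) for v in [1, 2, 3, 4])
print(res)  # [2, 4, 6, 8]
```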