EconML
cannot allocate memory with CausalAnalysis().fit()
OS Architecture:
Operating System: Ubuntu 20.04.4 LTS
Kernel: Linux 5.15.0-53-generic
Architecture: x86-64
With a DataFrame of shape `(rows, columns) = (9000, 102)`, the `fit` method often fails with `Cannot allocate memory`:
from econml.solutions.causal_analysis import CausalAnalysis
import warnings

warnings.simplefilter(action='ignore')

ca = CausalAnalysis(feature_inds=top_features, categorical=categorical,
                    heterogeneity_inds=None,
                    classification=True,
                    nuisance_models="automl",
                    heterogeneity_model="forest",
                    n_jobs=-1,
                    random_state=1234)
ca.fit(X, y)
As a temporary workaround, after some research, I tried enlarging the swap file with the following Linux commands:
swapon --show
sudo swapoff /swapfile
sudo fallocate -l 5G /swapfile
ls -lh /swapfile
Initially I had `1G` allocated. After running the commands above, the swap file is `5G`, but this did not solve the issue.
My question: `fit` fails on a small dataset of 9000 rows even though plenty of memory is still free (the machine has 64 GB of RAM, of which about 20 GB is unused). Has a similar issue been reported?
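For anyone trying to reproduce this, the figures that may matter for a failing `os.fork()` are the kernel's swap and commit-accounting values, not only free RAM. The following is a minimal sketch, assuming a Linux host with `/proc/meminfo` and `/proc/sys/vm/overcommit_memory`. The helper names `parse_meminfo`, `commit_headroom_kb` and `report` are made up for illustration and are not part of EconML or joblib. It is a way to collect numbers for the report, not a diagnosis of this issue.

```python
# Sketch: summarise the kernel memory figures that can affect fork().
# Linux-only. Helper names are hypothetical, not from any library.

def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: integer value}.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are counts.
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def commit_headroom_kb(info):
    """Return CommitLimit - Committed_AS, or None if either field is missing."""
    if "CommitLimit" in info and "Committed_AS" in info:
        return info["CommitLimit"] - info["Committed_AS"]
    return None


def report():
    """Print the swap and commit figures of the current machine."""
    with open("/proc/meminfo") as f:
        info = parse_meminfo(f.read())
    with open("/proc/sys/vm/overcommit_memory") as f:
        mode = f.read().strip()
    for key in ("MemAvailable", "SwapTotal", "SwapFree",
                "CommitLimit", "Committed_AS"):
        print(f"{key}: {info.get(key)} kB")
    print("overcommit_memory mode:", mode)
    print("commit headroom (kB):", commit_headroom_kb(info))
```

Calling `report()` right before `ca.fit(X, y)` shows whether the resized swap file is actually active (`SwapTotal`) and which overcommit mode is in use. The commit limit is strictly enforced only when `overcommit_memory` is `2`; in the default mode `0` the kernel applies a heuristic, so a negative headroom there does not by itself explain the error.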
Some references:
https://stackoverflow.com/questions/5306075/python-memory-allocation-error-using-subprocess-popen
https://stackoverflow.com/questions/20111242/how-to-avoid-errno-12-cannot-allocate-memory-errors-caused-by-using-subprocess
Full stack trace
File "/home/sambashare/newertext/thesis/docs/code/000_pyGraph/gb_rf_xgb.py", line 123, in <module>
xgn.ift_xgb_hpSearch(df, year, model_path=model_path, shap_plot_path=shap_plot_path,
File "/home/sambashare/newertext/thesis/docs/code/000_pyGraph/xgb_nov_tg.py", line 986, in ift_xgb_hpSearch
causalAnl(shap_values=contributions, train=train, test=test, title= rfhpSearch+'_Best_Model',
File "/home/sambashare/newertext/thesis/docs/code/000_pyGraph/xgb_nov_tg.py", line 214, in causalAnl
ca.fit(X, y)
File "/home/miniconda3/envs/tf-gpu-mem-day/lib/python3.10/site-packages/econml/solutions/causal_analysis/_causal_analysis.py", line 867, in fit
joblib.Parallel(
File "/home/miniconda3/envs/tf-gpu-mem-day/lib/python3.10/site-packages/joblib/parallel.py", line 1043, in __call__
if self.dispatch_one_batch(iterator):
File "/home/miniconda3/envs/tf-gpu-mem-day/lib/python3.10/site-packages/joblib/parallel.py", line 861, in dispatch_one_batch
self._dispatch(tasks)
File "/home/miniconda3/envs/tf-gpu-mem-day/lib/python3.10/site-packages/joblib/parallel.py", line 779, in _dispatch
job = self._backend.apply_async(batch, callback=cb)
File "/home/miniconda3/envs/tf-gpu-mem-day/lib/python3.10/site-packages/joblib/_parallel_backends.py", line 531, in apply_async
future = self._workers.submit(SafeFunction(func))
File "/home/miniconda3/envs/tf-gpu-mem-day/lib/python3.10/site-packages/joblib/externals/loky/reusable_executor.py", line 177, in submit
return super(_ReusablePoolExecutor, self).submit(
File "/home/miniconda3/envs/tf-gpu-mem-day/lib/python3.10/site-packages/joblib/externals/loky/process_executor.py", line 1135, in submit
self._ensure_executor_running()
File "/home/miniconda3/envs/tf-gpu-mem-day/lib/python3.10/site-packages/joblib/externals/loky/process_executor.py", line 1109, in _ensure_executor_running
self._adjust_process_count()
File "/home/miniconda3/envs/tf-gpu-mem-day/lib/python3.10/site-packages/joblib/externals/loky/process_executor.py", line 1100, in _adjust_process_count
p.start()
File "/home/miniconda3/envs/tf-gpu-mem-day/lib/python3.10/multiprocessing/process.py", line 121, in start
self._popen = self._Popen(self)
File "/home/miniconda3/envs/tf-gpu-mem-day/lib/python3.10/site-packages/joblib/externals/loky/backend/process.py", line 39, in _Popen
return Popen(process_obj)
File "/home/miniconda3/envs/tf-gpu-mem-day/lib/python3.10/site-packages/joblib/externals/loky/backend/popen_loky_posix.py", line 52, in __init__
self._launch(process_obj)
File "/home/miniconda3/envs/tf-gpu-mem-day/lib/python3.10/site-packages/joblib/externals/loky/backend/popen_loky_posix.py", line 157, in _launch
pid = fork_exec(cmd_python, self._fds, env=process_obj.env)
File "/home/miniconda3/envs/tf-gpu-mem-day/lib/python3.10/site-packages/joblib/externals/loky/backend/fork_exec.py", line 43, in fork_exec
pid = os.fork()
OSError: [Errno 12] Cannot allocate memory
`causalAnl` is my wrapper function that builds the following `CausalAnalysis` object and calls `ca.fit(X, y)`:
CausalAnalysis(feature_inds=top_features, categorical=categorical,
heterogeneity_inds=None,
classification=True,
nuisance_models="automl",
heterogeneity_model="forest",
n_jobs=-1,
random_state=1234)
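The trace ends in `os.fork()` while loky is starting worker processes, and the constructor is called with `n_jobs=-1`. One thing that may be worth trying is a small, fixed number of workers, so that fewer processes are forked. The following is a minimal sketch under that assumption. The helper `bounded_n_jobs` and the cap of 4 are made up for illustration, and whether this avoids the error here is untested.

```python
import os


def bounded_n_jobs(max_workers=4):
    """Return a worker count of at least 1 and at most min(max_workers, CPU count)."""
    cpus = os.cpu_count() or 1  # os.cpu_count() can return None
    return max(1, min(max_workers, cpus))


n_jobs = bounded_n_jobs(max_workers=4)
print("using n_jobs =", n_jobs)

# Hypothetical usage: same constructor call as above, with only n_jobs changed.
# ca = CausalAnalysis(feature_inds=top_features, categorical=categorical,
#                     heterogeneity_inds=None, classification=True,
#                     nuisance_models="automl", heterogeneity_model="forest",
#                     n_jobs=n_jobs, random_state=1234)
# ca.fit(X, y)
```

If `fit` succeeds with `n_jobs=1` but fails with larger values, that would suggest the failure is tied to process creation and not to the size of the data.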
Thanks, we'll take a look. How many entries are in your top_features?
I have tested with 5, 10, 15 and 10.
Best,
Krishna