pandarallel

OSError when use_memory_fs=True

Open pratikchhapolika opened this issue 4 years ago • 1 comments

%load_ext autoreload
%autoreload 2
import pandas as pd
import time
from pandarallel import pandarallel
import math
import numpy as np
pandarallel.initialize(nb_workers=60,progress_bar=True)


df_size = int(5e7)
df = pd.DataFrame(dict(a=np.random.rand(df_size) + 1))
df.head()

def func(x):
    return math.log10(math.sqrt(math.exp(x**2)))

%%time
df['b'] = df.a.parallel_map(func)
---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
/opt/conda/lib/python3.6/site-packages/pandarallel/pandarallel.py in get_workers_args(use_memory_fs, nb_workers, progress_bar, chunks, worker_meta_args, queue, func, args, kwargs)
    254                 dump_and_get_lenght(chunk, input_file)
--> 255                 for chunk, input_file in zip(chunks, input_files)
    256             ]

/opt/conda/lib/python3.6/site-packages/pandarallel/pandarallel.py in <listcomp>(.0)
    254                 dump_and_get_lenght(chunk, input_file)
--> 255                 for chunk, input_file in zip(chunks, input_files)
    256             ]

/opt/conda/lib/python3.6/site-packages/pandarallel/pandarallel.py in dump_and_get_lenght(chunk, input_file)
    243         with open(input_file.name, "wb") as file:
--> 244             pickle.dump(chunk, file)
    245 

OSError: [Errno 28] No space left on device

During handling of the above exception, another exception occurred:

OSError                                   Traceback (most recent call last)
/usr/local/bin/kernel-launchers/python/scripts/launch_ipykernel.py in <module>

/opt/conda/lib/python3.6/site-packages/pandarallel/pandarallel.py in closure(data, func, *args, **kwargs)
    425             func,
    426             args,
--> 427             kwargs,
    428         )
    429         try:

/opt/conda/lib/python3.6/site-packages/pandarallel/pandarallel.py in get_workers_args(use_memory_fs, nb_workers, progress_bar, chunks, worker_meta_args, queue, func, args, kwargs)
    270                 )
    271             )
--> 272             raise OSError(msg)
    273 
    274         workers_args = [

OSError: It seems you use Memory File System and you don't have enough available space in dev/shm. You can either call pandarallel.initalize with use_memory_fs=False, or you can increase the size of dev/shm as described here: https://stackoverflow.com/questions/58804022/how-to-resize-dev-shm . Please also remove all files beginning with 'pandarallel_' in the /dev/shm directory. If you have troubles with your web browser, these troubles should deseapper after cleaning /dev/shm.

Question: I get this error unless I pass use_memory_fs=False. If I want to keep using the memory file system, how can I get around it?
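For anyone hitting the same error, here is a rough sketch of a check that can be run before initializing. It is not part of pandarallel: the helper name `shm_report` and its parameters `required_bytes` and `shm_dir` are made up for illustration. It only uses the standard library to see how much room is left in /dev/shm and to list leftover files whose names start with "pandarallel_", which the error message asks to be removed.

```python
import glob
import os
import shutil


def shm_report(required_bytes, shm_dir="/dev/shm"):
    """Hypothetical helper: report free space and leftover chunk files.

    Returns a tuple (free_bytes, fits, stale_files) where `fits` tells
    whether `required_bytes` would fit in the free space of `shm_dir`.
    """
    # Free bytes on the file system that backs shm_dir.
    free_bytes = shutil.disk_usage(shm_dir).free
    # Leftovers from earlier runs; per the error message their names
    # begin with "pandarallel_".
    stale_files = sorted(glob.glob(os.path.join(shm_dir, "pandarallel_*")))
    return free_bytes, required_bytes <= free_bytes, stale_files
```

In the notebook above, `required_bytes` could be estimated with `df.memory_usage(deep=True).sum()`; the 5e7-row float64 column alone is about 400 MB (5e7 × 8 bytes), and the pickled chunks will be at least roughly that large. If `fits` comes back False, the options are the ones the error message lists: pass `use_memory_fs=False` to `pandarallel.initialize`, enlarge /dev/shm, or delete the files in `stale_files` after checking that no other job is still using them.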

pratikchhapolika, Aug 10 '21 12:08

@pratikchhapolika According to the docstring of initialize():

Memory file system is considered as available only if the
directory `/dev/shm` exists and if the user has read and write
permission on it.

Basically memory file system is only available on some Linux
distributions (including Ubuntu)

That said, what's your OS and does it have memory fs?
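To answer that on your machine, something like the following should do. It is only a sketch that mirrors the two conditions quoted from the docstring (the directory exists, and the user can read and write it); the function name `memory_fs_available` is my own and this is not necessarily how pandarallel checks it internally.

```python
import os


def memory_fs_available(shm_dir="/dev/shm"):
    """Hypothetical check: True if shm_dir exists and is readable and writable."""
    return os.path.isdir(shm_dir) and os.access(shm_dir, os.R_OK | os.W_OK)


if __name__ == "__main__":
    print("memory fs available:", memory_fs_available())
```

Note that this only says whether /dev/shm is usable at all, not whether it is big enough for the data being dumped into it.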

pratik-choudhari, Sep 17 '21 04:09

Closed (no activity)

till-m, Aug 22 '22 09:08