Made-With-ML OSError: [Errno 30] Cannot create directory '/efs'. Detail: [errno 30] Read-only file system

@GokuMohandas can you help me figure this out?

Oct 26 '23 01:10 bhavya-giri

[Two screenshots attached: 2023-10-30 at 7:58:46 AM and 7:59:08 AM]

Oct 30 '23 02:10 bhavya-giri

Same error here. Have you managed to resolve it?

Nov 12 '23 12:11 Meryl-Fang

I am having the same issue and have no idea why. Basically, the notebook is unable to load the function from the madewithml/data directory. A hack that worked for me is to create and run the following code cell above the failing code cell:

from typing import Tuple

import pandas as pd
from ray.data import Dataset
from sklearn.model_selection import train_test_split

def stratify_split(
    ds: Dataset,
    stratify: str,
    test_size: float,
    shuffle: bool = True,
    seed: int = 1234,
) -> Tuple[Dataset, Dataset]:
    """Split a dataset into train and test splits with equal
    amounts of data points from each class in the column we
    want to stratify on.

    Args:
        ds (Dataset): Input dataset to split.
        stratify (str): Name of column to split on.
        test_size (float): Proportion of dataset to split for test set.
        shuffle (bool, optional): whether to shuffle the dataset. Defaults to True.
        seed (int, optional): seed for shuffling. Defaults to 1234.

    Returns:
        Tuple[Dataset, Dataset]: the stratified train and test datasets.
    """

    def _add_split(df: pd.DataFrame) -> pd.DataFrame:  # pragma: no cover, used in parent function
        """Naively split a dataframe into train and test splits.
        Add a column specifying whether it's the train or test split."""
        train, test = train_test_split(df, test_size=test_size, shuffle=shuffle, random_state=seed)
        train["_split"] = "train"
        test["_split"] = "test"
        return pd.concat([train, test])

    def _filter_split(df: pd.DataFrame, split: str) -> pd.DataFrame:  # pragma: no cover, used in parent function
        """Filter by data points that match the split column's value
        and return the dataframe with the _split column dropped."""
        return df[df["_split"] == split].drop("_split", axis=1)

    # Train, test split with stratify
    grouped = ds.groupby(stratify).map_groups(_add_split, batch_format="pandas")  # group by each unique value in the column we want to stratify on
    train_ds = grouped.map_batches(_filter_split, fn_kwargs={"split": "train"}, batch_format="pandas")  # combine
    test_ds = grouped.map_batches(_filter_split, fn_kwargs={"split": "test"}, batch_format="pandas")  # combine

    # Shuffle each split (required)
    train_ds = train_ds.random_shuffle(seed=seed)
    test_ds = test_ds.random_shuffle(seed=seed)

    return train_ds, test_ds

Basically, instead of importing the function, which fails for reasons I have not figured out, we define it directly in the notebook.

Nov 12 '23 17:11 taaha

But the same error would come up again during training. Check this repo: https://github.com/GokuMohandas/mlops-course

Nov 12 '23 17:11 bhavya-giri

As the error message indicates, this error is caused by permissions on the /efs folder you are trying to create. I assume you are using your own local machine. I edited the files as described below, and it worked in my local environment (macOS 14.1.2, Python 3.10.11). The path will differ depending on where your directory is located. I hope this helps.

config.py: change line 13 to EFS_DIR = Path(f"/Users/<your_user_name>/efs/shared_storage/madewithml/{os.environ.get('GITHUB_USERNAME', '')}")
madewithml.ipynb: in the Setup section, change the code to EFS_DIR = f"/Users/<your_user_name>/efs/shared_storage/madewithml/{os.environ['GITHUB_USERNAME']}"
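
If you would rather not hard-code a user name, the same idea can be sketched with the home directory looked up at runtime. This is only a minimal sketch and not part of the repo: the helper name resolve_efs_dir and the choice of ~/efs as the base folder are my own assumptions, and the sub-path simply mirrors the one used above.

```python
import os
from pathlib import Path
from typing import Optional


def resolve_efs_dir(base: Optional[Path] = None) -> Path:
    """Build a writable stand-in for /efs and create it if needed.

    `resolve_efs_dir` is a hypothetical helper, not something the repo provides.
    `base` defaults to ~/efs (an assumption); pass any folder you can write to.
    """
    if base is None:
        base = Path.home() / "efs"
    # Same fallback as config.py: empty string when GITHUB_USERNAME is unset.
    username = os.environ.get("GITHUB_USERNAME", "")
    efs_dir = base / "shared_storage" / "madewithml" / username
    # parents=True creates the intermediate folders; exist_ok=True makes reruns safe.
    efs_dir.mkdir(parents=True, exist_ok=True)
    return efs_dir
```

In config.py this would be used as EFS_DIR = resolve_efs_dir(). The notebook builds EFS_DIR as a string, so there it would be EFS_DIR = str(resolve_efs_dir()).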

Dec 03 '23 11:12 gOsuzu
