[Data] - Reading zipped JSONL files results in error.
What happened + What you expected to happen
What happened
I ran the script below to read in data in zipped JSONL format and ran into this error:
ValueError: No input files found to read with the following file extensions: ['json', 'jsonl']. Please double check that 'file_extensions' field is set properly.
What you expected to happen
That ray.data can read zipped JSONL files out of the box.
Versions / Dependencies
Python 3.7.11, Linux Fedora 40, Ray master
Reproduction script
Execute this script from your ray root folder.
from pathlib import Path
import ray
base_path = Path(__file__).parent / "rllib"
data_path = base_path / "tests/data/pendulum/enormous.zip"
# Read in the `SampleBatch` data using `ray.data.read_json`.
ds = ray.data.read_json(data_path.as_posix())
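Possible workaround until zipped input is supported: unpack the archive first and pass the extracted files to `ray.data.read_json`. The sketch below uses only the standard library. The helper name `extract_jsonl_archive` is hypothetical and not part of Ray, and it assumes the archive contains plain `.json` or `.jsonl` files and that there is enough local disk space to extract them.

```python
import tempfile
import zipfile
from pathlib import Path


def extract_jsonl_archive(zip_path, dest_dir=None):
    """Unpack a zip archive and return the JSON/JSONL file paths found inside.

    Hypothetical helper for illustration only; it is not a Ray API.
    """
    # Extract into the given directory, or into a fresh temporary one.
    dest = Path(dest_dir) if dest_dir is not None else Path(tempfile.mkdtemp())
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    # Keep only files whose extension matches what the error message lists.
    return sorted(
        p.as_posix()
        for p in dest.rglob("*")
        if p.is_file() and p.suffix in {".json", ".jsonl"}
    )


# Assumed usage with the reproduction script above:
#   ds = ray.data.read_json(extract_jsonl_archive(data_path))
```

This only sidesteps the extension check by handing `read_json` files that already end in `.json` or `.jsonl`; it does not make Ray read the zip archive directly.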
Issue Severity
High: It blocks me from completing my task.