Issue loading datasets -- pyarrow.lib has no attribute
Describe the bug
I am trying to load sentiment analysis datasets from Hugging Face, but every dataset I try to load via load_dataset fails with the same error:
AttributeError: module 'pyarrow.lib' has no attribute 'IpcReadOptions'
Steps to reproduce the bug
from datasets import load_dataset
dataset = load_dataset("glue", "cola")
Expected results
Download datasets without issue.
Actual results
AttributeError: module 'pyarrow.lib' has no attribute 'IpcReadOptions'
Environment info
- datasets version: 2.3.2
- Platform: macOS-10.15.7-x86_64-i386-64bit
- Python version: 3.8.5
- PyArrow version: 8.0.0
- Pandas version: 1.1.0
Hi @margotwagner, thanks for reporting.
Unfortunately, I'm not able to reproduce your bug: in an environment with datasets-2.3.2 and pyarrow-8.0.0, I can load the datasets without any problem:
>>> ds = load_dataset("glue", "cola")
>>> ds
DatasetDict({
    train: Dataset({
        features: ['sentence', 'label', 'idx'],
        num_rows: 8551
    })
    validation: Dataset({
        features: ['sentence', 'label', 'idx'],
        num_rows: 1043
    })
    test: Dataset({
        features: ['sentence', 'label', 'idx'],
        num_rows: 1063
    })
})
>>> import pyarrow
>>> pyarrow.__version__
8.0.0
>>> from pyarrow.lib import IpcReadOptions
>>> IpcReadOptions
pyarrow.lib.IpcReadOptions
I think there may be a problem in your Python environment: you may also have an older version of pyarrow installed that takes precedence when it is imported.
Could you please check this (just after you tried to load the dataset and got the error)?
>>> import pyarrow
>>> pyarrow.__version__
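If the version alone is not conclusive, it may help to also look at where the module is imported from, since a second, older copy would normally live under a different path. The following is only a sketch: describe_module is a hypothetical helper written for this check and is not part of datasets or pyarrow. It uses only the standard library and reports the interpreter in use, the version and file path of the imported module, and whether the attribute from the error message is present.

```python
import importlib
import sys


def describe_module(module_name, attribute=None):
    """Report which copy of a module this interpreter actually imports.

    Hypothetical diagnostic helper; not part of datasets or pyarrow.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module cannot be imported at all from this environment.
        return {"name": module_name, "installed": False}

    info = {
        "name": module_name,
        "installed": True,
        # Not every module defines __version__, so fall back to None.
        "version": getattr(module, "__version__", None),
        # The file path shows which installation was picked up.
        "path": getattr(module, "__file__", None),
    }
    if attribute is not None:
        info["has_attribute"] = hasattr(module, attribute)
    return info


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print(describe_module("pyarrow"))
    print(describe_module("pyarrow.lib", "IpcReadOptions"))
```

If the reported version is lower than 8.0.0, or the path points somewhere other than the environment you expect, that would suggest an older installation is shadowing the new one.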