Reranker
Datasets.load_dataset breaks with Python 3.9
Error: if Python 3.9 is installed, the setup command installs Pandas 1.3.0, because older versions of Pandas are not compatible with Python 3.9. This Pandas version does not accept the following call:
read_csv("file.csv", names=None, prefix=None)
breaking the load_dataset function when used with the csv script.
The function call below, in build_train_from_ranking.py, raises the following error: "ValueError: Specified named and prefix; you can only specify one."
train_doc_collection = datasets.load_dataset(
    path='csv',
    data_files=collection_path,
    column_names=columns,
    delimiter='\t',
    ignore_verifications=True,
)['train']
That is because Pandas 1.3.0 no longer accepts None for these parameters, only the pandas._libs.lib.no_default constant, as per pandas issue #42387.
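To illustrate why an explicit None now fails, here is a minimal sketch of the sentinel-default pattern. It is not the actual pandas code: `read_csv_like` and `_NO_DEFAULT` are hypothetical names, and the check is a simplified assumption modeled on the error message quoted above.

```python
# Hypothetical stand-in for the sentinel object used as a default value.
_NO_DEFAULT = object()


def read_csv_like(path, names=_NO_DEFAULT, prefix=_NO_DEFAULT):
    """Simplified model of a reader that tells "not passed" apart from None."""
    # Any explicitly passed value, including None, differs from the sentinel.
    if names is not _NO_DEFAULT and prefix is not _NO_DEFAULT:
        raise ValueError("Specified named and prefix; you can only specify one.")
    # From here on, treat an omitted argument as None.
    names = None if names is _NO_DEFAULT else names
    prefix = None if prefix is _NO_DEFAULT else prefix
    return {"path": path, "names": names, "prefix": prefix}


# Omitting both arguments works.
print(read_csv_like("file.csv"))

# Passing None for both counts as "both specified" and fails.
try:
    read_csv_like("file.csv", names=None, prefix=None)
except ValueError as err:
    print(f"ValueError: {err}")
```

Under this model, a caller that always forwards names=None and prefix=None hits the error even though neither value carries any information.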
Downgrading to Python 3.8 and Pandas 1.0.4 corrects the problem.
I believe Python 3.8 should be enforced.
I will take a look. In particular, I want to know whether this is a regression caused by an outdated datasets package. Can you print datasets.__version__?
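One way to collect the requested version information without importing the packages is sketched below. It relies only on the standard library (importlib.metadata, available from Python 3.8); the helper name `installed_version` is hypothetical.

```python
import platform
from importlib import metadata  # standard library since Python 3.8


def installed_version(package):
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report the interpreter and the two packages discussed in this thread.
print("python  ", platform.python_version())
for name in ("datasets", "pandas"):
    print(f"{name:<8}", installed_version(name))
```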
Hi! This is an issue with pandas 1.3.0. Please update datasets or use an older version of pandas until this is fixed.
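As a rough sketch, a script could check up front whether the installed pandas is the release named in this thread. The helper names (`parse_version`, `is_affected`, `check_environment`) are hypothetical, and the check assumes only 1.3.0 is affected, since that is the only version mentioned here.

```python
import re
from importlib import metadata  # standard library since Python 3.8

# The pandas release named in this thread as triggering the error.
AFFECTED_PANDAS = (1, 3, 0)


def parse_version(text):
    """Turn a string such as '1.3.0' into a tuple of exactly three integers."""
    numbers = [int(p) for p in re.findall(r"\d+", text)[:3]]
    numbers += [0] * (3 - len(numbers))  # pad short versions such as '1.3'
    return tuple(numbers)


def is_affected(pandas_version):
    """Return True if the version string matches the affected release."""
    return parse_version(pandas_version) == AFFECTED_PANDAS


def check_environment():
    """Describe whether the installed pandas matches the affected release."""
    try:
        version = metadata.version("pandas")
    except metadata.PackageNotFoundError:
        return "pandas is not installed"
    if is_affected(version):
        return f"pandas {version} is affected: update datasets or install an older pandas"
    return f"pandas {version} is not the release named in this thread"


print(check_environment())
```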