
`from_parquet` return type annotation

Open saiden89 opened this issue 4 months ago • 0 comments

Describe the bug

As already reported in https://github.com/microsoft/pylance-release/issues/6534, type hinting breaks for a dataset built with the `from_parquet` constructor. The suggestion given there is to fully annotate the method's return type so that it matches the information in the docstring.

Steps to reproduce the bug

from datasets import Dataset

dataset = Dataset.from_parquet(path_or_paths="file")
dataset.map(lambda x: {"new": x["old"]}, batched=True)

Expected behavior

`map` is a valid method on the returned dataset, so the type checker should not report an error.

Environment info

  • datasets version: 3.0.1
  • Platform: macOS-15.0.1-arm64-arm-64bit
  • Python version: 3.12.6
  • huggingface_hub version: 0.25.1
  • PyArrow version: 17.0.0
  • Pandas version: 2.2.3
  • fsspec version: 2024.6.1

saiden89 · Oct 08 '24 09:10