Quentin Lhoest

Results: 416 comments by Quentin Lhoest

What's the error you're getting @Eric2i ? On my side on 10.3.0 I could run this without errors:

```python
import PIL.Image

PIL.Image.ExifTags.Base.Orientation is not None  # True
```

hfh 0.23.1 and transformers 4.41.0 are out, let's unpin, no ?

The errors were coming from `transformers` raising a FutureWarning when loading models or tokenizers. I disabled the warnings for the `transformers`-related calls since they're not related to `datasets`.

It's because the error from the FutureWarning happened when running `cached_file()` from `transformers`, which has some code that does a try/except and re-raises an OSError.
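To illustrate the mechanism described above, here is a minimal standard-library sketch. It assumes the test suite turns warnings into errors; `hypothetical_cached_file` is a made-up stand-in, not the actual `transformers` code, and the two loader functions are hypothetical names for illustration only.

```python
import warnings


def hypothetical_cached_file(path):
    """Made-up stand-in for a library helper that wraps any failure in an OSError."""
    try:
        # The library emits a deprecation-style warning while loading.
        warnings.warn("this argument is deprecated", FutureWarning)
        return path
    except Exception as err:
        # When warnings are errors, the FutureWarning lands here and is
        # re-raised as an OSError, which hides the real cause.
        raise OSError(f"could not load {path}") from err


def load_strict(path):
    """Call the helper with every warning turned into an error."""
    with warnings.catch_warnings():
        warnings.simplefilter("error")
        return hypothetical_cached_file(path)


def load_ignoring_future_warnings(path):
    """Same strict setup, but FutureWarning is ignored for this call only."""
    with warnings.catch_warnings():
        warnings.simplefilter("error")
        # Inserted in front of the "error" filter, so it takes precedence.
        warnings.filterwarnings("ignore", category=FutureWarning)
        return hypothetical_cached_file(path)
```

In this sketch `load_strict` surfaces an OSError whose `__cause__` is the FutureWarning, while `load_ignoring_future_warnings` returns normally. Scoping the filter with `catch_warnings()` keeps the rest of the suite strict.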

alright I disabled the errors on FutureWarning, do you see anything else @albertvillanova or we can merge ?

Hi ! Thanks for diving into this, this conversion to python lists is indeed quite slow. Array2DExtensionType and Array3DExtensionType currently rely on pyarrow lists, but we will soon modify them...

No one is working on this atm afaik (and actually we don't have any ETA unfortunately). To do this change I think we need to: - update the `_ArrayXD` parent...

Nice, thanks @Modexus !

Can we start using FixedShapeTensor or FixedSizeList even if pandas/polars don't support them fully yet ? We would still get the benefit of optimized conversion to numpy

Cool ! `filter` will be very useful. There can be a filter that you can apply on a streaming dataset:

```python
load_dataset(..., streaming=True).filter(lambda x: x["lang"] == "sw")
```

Otherwise if...