Quentin Lhoest

Results 416 comments of Quentin Lhoest

At HF we want to make the Hub more open and support more data formats and libraries. We recently added support for WebDataset for example, and there are hundreds of...

> Just following up on this; @karan6181 @lhoestq -- my understanding is that the HF Hub exposes dataset repositories via an fsspec API: https://huggingface.co/docs/huggingface_hub/main/en/guides/hf_file_system
>
> From the Mosaic Streaming perspective...
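
For readers unfamiliar with the fsspec interface mentioned in the quote, here is a minimal sketch of listing a dataset repository through `HfFileSystem` from `huggingface_hub`. The helper names `hub_dataset_path` and `list_dataset_files` are hypothetical and only for illustration; the path layout (`datasets/<repo_id>[@<revision>]/<path>`) follows the guide linked above, and revisions containing special characters are not handled here.

```python
def hub_dataset_path(repo_id, path_in_repo="", revision=None):
    """Build an HfFileSystem path for a dataset repo (hypothetical helper)."""
    root = f"datasets/{repo_id}"
    if revision is not None:
        # An optional revision is appended to the repo id with "@".
        root = f"{root}@{revision}"
    return f"{root}/{path_in_repo}" if path_in_repo else root


def list_dataset_files(repo_id, revision=None):
    """List the top-level entries of a dataset repo via the fsspec API."""
    # Imported lazily so the path helper works without huggingface_hub installed.
    from huggingface_hub import HfFileSystem

    fs = HfFileSystem()
    # detail=False returns plain path strings instead of metadata dicts.
    return fs.ls(hub_dataset_path(repo_id, revision=revision), detail=False)
```

Calling `list_dataset_files("mteb/biblenlp-corpus-mmteb")` would issue requests to the Hub, so it needs network access and is subject to the Hub's rate limits.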

Wow, amazing! Are there some docs already on how to use it? Also, let me know if you plan to share this on social media, I'll be happy...

This is also causing bugs in `datasets` when loading datasets with many files, e.g. `load_dataset('mteb/biblenlp-corpus-mmteb')`:
```
huggingface_hub.utils._errors.HfHubHTTPError: 429 Client Error: Too Many Requests for url: https://huggingface.co/api/datasets/mteb/biblenlp-corpus-mmteb/paths-info/3912ed967b0834547f35b2da9470c4976b357c9a
```
could you take...
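
As a client-side stopgap for 429 responses like the one above, a caller could retry with exponential backoff. This is only a sketch of that general technique, not the fix discussed in this thread: `TooManyRequests` and `call_with_backoff` are hypothetical names, and real code would instead catch `HfHubHTTPError` and check for a 429 status before retrying.

```python
import random
import time


class TooManyRequests(Exception):
    """Hypothetical stand-in for an HTTP 429 error."""


def call_with_backoff(fn, max_retries=5, base_delay=1.0, max_delay=60.0, sleep=time.sleep):
    """Call fn(), retrying on TooManyRequests with exponential backoff and jitter."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except TooManyRequests:
            if attempt == max_retries:
                # Out of retries: surface the error to the caller.
                raise
            # Double the wait on each attempt, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Add up to 10% jitter so concurrent clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay / 10))
```

The `sleep` parameter is injectable so the retry schedule can be checked without actually waiting.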

It's been in prod in datasets-viewer, and it fixes the HfHubHTTPError (Too Many Requests) both for the FineWeb viewer and for loading the mmteb datasets in `datasets`.