Quentin Lhoest comments

Results: 416 comments by Quentin Lhoest

[GH->HF] Part 2: Remove all dataset scripts from github

We are indeed deprecating the metrics in `datasets` and suggest that users switch to `evaluate` (via a warning message). We'll keep the current metrics as they are for now, but...

[GH->HF] Part 2: Remove all dataset scripts from github

I guess this is ready to merge? It should break nothing except in one rare case: if someone is using an old version of `datasets` to try to load a...

[GH->HF] Part 2: Remove all dataset scripts from github

Let's merge this on Monday if we can, so that contributors who wanted to merge their dataset PRs here still have time to do so.

[GH->HF] Part 2: Remove all dataset scripts from github

Alright, merging!

Inconsistent caching behaviour when using `Dataset.map()` with a `new_fingerprint` and `num_proc>1`

Following the discussion in #3045, it would be nice to have a way to give users a good caching experience even if the function is not hashable. Currently...

Identical keywords in build_kwargs and config_kwargs lead to TypeError in load_dataset_builder()

Hi! I think this can be fixed by letting the config_kwargs take over the builder kwargs here: https://github.com/huggingface/datasets/blob/7feeb5648a63b6135a8259dedc3b1e19185ee4c7/src/datasets/load.py#L1533-L1534

Maybe something like this?

```python
**{**builder_kwargs, **config_kwargs}
```

Let me...
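To illustrate why merging the two dicts avoids the TypeError, here is a minimal standalone sketch. The `build` function and the sample keyword values are hypothetical stand-ins, not the actual code in `load.py`; only the dict-merge expression comes from the comment above.

```python
def build(name=None, **kwargs):
    """Hypothetical stand-in for a builder constructor (not the real API)."""
    return {"name": name, **kwargs}


# Assumed example values: both dicts carry the same `name` keyword.
builder_kwargs = {"name": "default", "cache_dir": "/tmp/a"}
config_kwargs = {"name": "custom"}

# Unpacking both dicts separately repeats `name`, which raises TypeError.
duplicate_error = None
try:
    build(**builder_kwargs, **config_kwargs)
except TypeError as err:
    duplicate_error = err

# Merging first keeps one value per key; later dicts win,
# so config_kwargs takes over the builder kwargs.
merged = {**builder_kwargs, **config_kwargs}
result = build(**merged)
```

With this ordering, a key present in both dicts resolves to the `config_kwargs` value, while keys that appear in only one dict pass through unchanged.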

Move DatasetInfo from `datasets_infos.json` to the YAML tags in `README.md`
