deep-martin
Bump datasets from 1.6.2 to 2.6.1
Bumps datasets from 1.6.2 to 2.6.1.
Release notes
Sourced from datasets's releases.
2.6.1
Bug fixes
- Fix filter indices when batched by @albertvillanova in huggingface/datasets#5113
  - fixed a bug where `filter` could return examples with the wrong indices
- Fix iter_batches by @lhoestq in huggingface/datasets#5115
  - fixed a bug where `map` with `batch=True` could return a dataset with less examples
- Fix a typo in arrow_dataset.py by @yangky11 in huggingface/datasets#5108
New Contributors
- @yangky11 made their first contribution in huggingface/datasets#5108
Full Changelog: https://github.com/huggingface/datasets/compare/2.6.0...2.6.1
2.6.0
Important
- [GH->HF] Remove all dataset scripts from github by @lhoestq in huggingface/datasets#4974
  - all the dataset scripts and dataset cards are now on https://hf.co/datasets
  - we invite users and contributors to open discussions or pull requests on the Hugging Face Hub from now on
Datasets features
- Add ability to read-write to SQL databases. by @Dref360 in huggingface/datasets#4928
  - Read from sqlite file:

        from datasets import Dataset
        dataset = Dataset.from_sql("data_table", "sqlite:///sqlite_file.db")

- Allow connection objects in `from_sql` + small doc improvement by @mariosasko in huggingface/datasets#5091

      from datasets import Dataset
      from sqlite3 import connect
      con = connect(...)
      dataset = Dataset.from_sql("SELECT text FROM table WHERE length(text) > 100 LIMIT 10", con)

- Image & Audio formatting for numpy/torch/tf/jax by @lhoestq in huggingface/datasets#5072
  - return numpy/torch/tf/jax tensors with

        from datasets import load_dataset
        ds = load_dataset("imagenet-1k").with_format("torch")  # or numpy/tf/jax
        ds[0]["image"]

- Added `IterableDataset.from_generator` by @hamid-vakilzadeh in huggingface/datasets#5052
- Fast dataset iter by @mariosasko in huggingface/datasets#5030
  - speed up by a factor of 2 using the Arrow Table reader
- Dataset infos in yaml by @lhoestq in huggingface/datasets#4926
  - you can now specify the feature types and number of samples in the dataset card, see https://huggingface.co/docs/datasets/dataset_card
- Add `kwargs` to `Dataset.from_generator` by @mariosasko in huggingface/datasets#5049
- Support `converters` in `CsvBuilder` by @mariosasko in huggingface/datasets#5057
- Restore saved format state in `load_from_disk` by @asofiaoliveira in huggingface/datasets#5073
... (truncated)
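The `from_sql` entries above take either a connection URI or an open connection object. The following is a minimal sketch of the connection-object form, not the library's documented recipe. The in-memory database, the `docs` table, its rows, and the helper names `build_demo_connection` and `load_long_texts` are all hypothetical. The query has the same shape as the one in the release notes, but targets `docs` because `table` is a reserved word in SQLite. The `datasets` import is deferred into the helper so the SQLite part runs on its own; calling the helper assumes `datasets` 2.6.0 or later is installed.

```python
import sqlite3


def build_demo_connection():
    """Create an in-memory SQLite database with a hypothetical `docs` table."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE docs (text TEXT)")
    con.executemany(
        "INSERT INTO docs (text) VALUES (?)",
        [("short",), ("a" * 150,), ("b" * 200,)],
    )
    con.commit()
    return con


# Same shape as the query in the release notes, against the hypothetical table.
QUERY = "SELECT text FROM docs WHERE length(text) > 100 LIMIT 10"


def load_long_texts(con):
    """Hand the open connection to Dataset.from_sql (assumes datasets >= 2.6.0)."""
    from datasets import Dataset  # imported lazily so the rest runs without it

    return Dataset.from_sql(QUERY, con)


con = build_demo_connection()
# Sanity-check the query with plain sqlite3 before handing it to datasets.
rows = con.execute(QUERY).fetchall()
print(len(rows))
```

With `datasets` installed, `load_long_texts(con)` should return a `Dataset` with a single `text` column holding the rows that pass the length filter.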
Commits
- 1742cf1 Release: 2.6.1
- eadc79a Fix iter_batches (#5115)
- d60f5ff Fix filter indices when batched (#5113)
- 3ad9644 Fix a typo in arrow_dataset.py (#5108)
- 0d4e390 set dev version
- dc3f72e Release: 2.6.0
- 99680a7 Fix task template reload from dict (#5106)
- dc4c764 fix for evaluate 0.2.2
- 9ec6cc7 Free the "hf" filesystem protocol for `hffs` (#5101)
- bbebe3f url encode hub url (#5099) (#5103)
- Additional commits viewable in compare view
Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`.
Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)