smashed

SMASHED is a toolkit designed to apply transformations to samples in datasets, such as field extraction, tokenization, prompting, batching, and more. It supports datasets from Hugging Face, torchdata ite...
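To make the idea of chained sample transformations concrete, here is a minimal sketch in plain Python. It does not use the smashed API: `extract_fields`, `whitespace_tokenize`, `batch`, and `chain` are hypothetical helpers written for illustration only, and the dataset is assumed to be a simple list of dictionaries.

```python
from typing import Callable, Dict, List

Sample = Dict[str, object]
Mapper = Callable[[List[Sample]], List[Sample]]


def extract_fields(fields: List[str]) -> Mapper:
    """Hypothetical mapper: keep only the listed fields of each sample."""
    def mapper(samples: List[Sample]) -> List[Sample]:
        return [{k: s[k] for k in fields} for s in samples]
    return mapper


def whitespace_tokenize(field: str) -> Mapper:
    """Hypothetical mapper: add a 'tokens' field by splitting on whitespace."""
    def mapper(samples: List[Sample]) -> List[Sample]:
        return [{**s, "tokens": str(s[field]).split()} for s in samples]
    return mapper


def batch(size: int) -> Mapper:
    """Hypothetical mapper: group consecutive samples into batches."""
    def mapper(samples: List[Sample]) -> List[Sample]:
        return [{"batch": samples[i:i + size]} for i in range(0, len(samples), size)]
    return mapper


def chain(*mappers: Mapper) -> Mapper:
    """Apply the mappers one after another, left to right."""
    def pipeline(samples: List[Sample]) -> List[Sample]:
        for m in mappers:
            samples = m(samples)
        return samples
    return pipeline


dataset = [
    {"id": 1, "text": "hello world", "meta": "x"},
    {"id": 2, "text": "smashed maps samples", "meta": "y"},
    {"id": 3, "text": "a b c", "meta": "z"},
]

pipeline = chain(extract_fields(["text"]), whitespace_tokenize("text"), batch(2))
result = pipeline(dataset)
```

Each step takes a list of samples and returns a new list, so steps can be reordered or swapped without touching the others. The real library's mapper interface may differ.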

Results 6 smashed issues

Bumps [actions/setup-python](https://github.com/actions/setup-python) from 4.6.1 to 4.7.1. Release notes: Sourced from actions/setup-python's releases. v4.7.1, What's Changed: Bump word-wrap from 1.2.3 to 1.2.4 by @dependabot in actions/setup-python#702. Add range validation for toml...

dependencies
github_actions

Bumps [actions/checkout](https://github.com/actions/checkout) from 3 to 4. Release notes: Sourced from actions/checkout's releases. v4.0.0, What's Changed: Update default runtime to node20 by @takost in actions/checkout#1436. Support fetching without the --progress option...

dependencies
github_actions

Hi! With the introduction of Smashed, munging datasets of long documents is going to be a lot more fun :) This draft PR is simply to explore the idea below....

These mappers are unique in that they read/write data from/to disk, so I made a new file for them. I think they can provide three primary benefits to users: 1)...

- Maybe a decorator to add on top of each mapper
- If you want to be fancy, even show which one is available for each interface
- Create a...
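One way the decorator idea above could look, as a rough sketch only: a class decorator records which dataset interfaces a mapper claims to support, and a small helper lists availability per interface. The names `supports`, `SUPPORTED`, `availability_table`, the two mapper classes, and the interface labels are all hypothetical and are not taken from the smashed codebase.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical registry: interface name -> names of mappers that support it.
SUPPORTED: Dict[str, List[str]] = defaultdict(list)


def supports(*interfaces: str):
    """Class decorator that records the interfaces a mapper declares."""
    def decorator(cls):
        cls.supported_interfaces = tuple(interfaces)
        for name in interfaces:
            SUPPORTED[name].append(cls.__name__)
        return cls
    return decorator


@supports("huggingface", "list_of_dicts")
class LowercaseMapper:
    def map(self, samples):
        return [{**s, "text": s["text"].lower()} for s in samples]


@supports("list_of_dicts")
class DropEmptyMapper:
    def map(self, samples):
        return [s for s in samples if s["text"]]


def availability_table() -> str:
    """Render one line per interface listing the mappers available for it."""
    lines = []
    for interface in sorted(SUPPORTED):
        lines.append(f"{interface}: {', '.join(SUPPORTED[interface])}")
    return "\n".join(lines)
```

Because the registry is filled at class definition time, the table could be generated for documentation without instantiating any mapper.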

documentation