unitxt issues

Rouge might not be correct

ROUGE in the old unitxt included a preprocessing step (something to do with sentence separation) that is not included here. This might affect the results. @gitMichal
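The issue does not say which preprocessing step is missing. One common candidate, offered here only as an assumption, is the sentence splitting that ROUGE-Lsum relies on: the text is split into sentences and joined with newlines before scoring. A minimal sketch with a naive regex splitter (the function name `split_sentences_for_rouge` is hypothetical and not part of unitxt):

```python
import re


def split_sentences_for_rouge(text: str) -> str:
    """Put each sentence on its own line, as newline-based ROUGE-Lsum scoring expects."""
    # Naive splitter (an assumption, not the old unitxt logic):
    # break after '.', '!' or '?' when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return "\n".join(s for s in sentences if s)
```

If the old code used a proper sentence tokenizer, the regex here would only approximate it, so scores could still differ on abbreviations and similar edge cases.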

elronbandel

Random split of a predefined size

Cap the maximum number of examples returned by the random split mix (e.g., nobody needs 5% of a 1-trillion-sentence dataset as a test set).
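The requested behavior could look roughly like the sketch below: take a random fraction of the examples, but never more than a fixed maximum. The function `capped_random_split` and its parameters are hypothetical, not an existing unitxt API, and the sketch assumes the examples fit in memory.

```python
import random
from typing import List, Optional, Sequence, Tuple, TypeVar

T = TypeVar("T")


def capped_random_split(
    items: Sequence[T],
    fraction: float,
    max_size: Optional[int] = None,
    seed: int = 0,
) -> Tuple[List[T], List[T]]:
    """Randomly carve out `fraction` of `items`, but never more than `max_size`."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    target = int(len(items) * fraction)
    if max_size is not None:
        target = min(target, max_size)  # apply the cap
    indices = list(range(len(items)))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    chosen = set(indices[:target])
    subset = [items[i] for i in indices[:target]]
    rest = [x for i, x in enumerate(items) if i not in chosen]
    return subset, rest
```

For example, `capped_random_split(examples, 0.05, max_size=10_000)` would return at most 10,000 test examples regardless of dataset size. A streaming dataset would need a different approach, such as reservoir sampling.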

borgr

enhancement

Use of cache after changing card returns stale results

I changed a card (added a preprocessing step), but the dataset was still loaded from the cache:

07/16/2023 13:49:32 - WARNING - datasets.arrow_dataset - Loading cached processed dataset at /Users/yoavkatz/cache/huggingface/datasets/unitxt___data/card=cards.sst2_sentiment,template_item=0/1.1.1/161c975966d35694e0db488ca61993c4a4cfb44975f0fa25e6aac6dc3806b97f/cache-d2a30425e116067b.arrow

Need to...
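One way to avoid stale hits like this, sketched under the assumption that a card can be serialized to a JSON-compatible dictionary, is to include a hash of the card's full content in the cache key instead of only its name. The helper `card_fingerprint` and the dictionary layout below are hypothetical, not unitxt or `datasets` API:

```python
import hashlib
import json


def card_fingerprint(card: dict) -> str:
    """Return a stable hash of the card content, suitable for a cache key."""
    # sort_keys makes the serialization independent of dict insertion order.
    canonical = json.dumps(card, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Illustrative card contents (made up for this example).
original = {"loader": "sst2", "preprocess_steps": ["rename_label"]}
changed = {"loader": "sst2", "preprocess_steps": ["rename_label", "lowercase"]}

# Adding a preprocessing step changes the fingerprint, so the old cache entry is not reused.
assert card_fingerprint(original) != card_fingerprint(changed)
```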

yoavkatz

FormTask name is uninformative

The name is uninformative. Was it meant to be multipleChoice?

borgr

ease-of-use

Add relation extraction

6

Adding support for relation-extraction task.

pklpriv

Add new Loader from Huggingface space

Some data is available in Hugging Face Spaces but not as HF datasets. We'd like a custom loader for Hugging Face Spaces.

```
class LoadFromHFSpace(Loader):
    user_name: str
    space_name: str
    data_files: Mapping[str, str]
    _requirements_list: List[str]...
```

yoavkatz
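For the Hugging Face Space loader requested above, the core step is resolving each data file to a downloadable location. The sketch below is not the proposed `LoadFromHFSpace` class: `HFSpaceFiles` and its `revision` field are hypothetical, and it assumes files in a Space are served under the `https://huggingface.co/spaces/<user>/<space>/resolve/<revision>/<path>` pattern.

```python
from dataclasses import dataclass
from typing import Dict, Mapping


@dataclass
class HFSpaceFiles:
    """Hypothetical helper mapping split names to file URLs in a Space."""

    user_name: str
    space_name: str
    data_files: Mapping[str, str]  # split name -> file path inside the Space
    revision: str = "main"  # assumed default branch

    def urls(self) -> Dict[str, str]:
        # Assumed URL layout for raw files hosted in a Space repository.
        base = (
            f"https://huggingface.co/spaces/"
            f"{self.user_name}/{self.space_name}/resolve/{self.revision}"
        )
        return {split: f"{base}/{path}" for split, path in self.data_files.items()}
```

A real loader would more likely delegate the download to `huggingface_hub.hf_hub_download` with `repo_type="space"`, which also handles authentication and local caching.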

Implementation of select safety benchmarks

1

Implementation of select safety benchmarks used in the MLCommons AI Safety Benchmark (https://mlcommons.org/working-groups/ai-safety/ai-safety/). Based on code at https://github.com/mlcommons/modelgauge. Signed-off-by: Jonathan Bnayahu

bnayahu

Align Bold and Attaq card with a documented catalog task

These cards currently use undocumented templates and tasks, which makes them harder for people to use. They are also not browsable in the exploration UI. https://github.com/IBM/unitxt/blob/main/prepare/cards/bold.py https://github.com/IBM/unitxt/blob/main/prepare/cards/atta_q.py

elronbandel

DeprecatedField triggers a warning when using a class with it even when the user did not use the deprecated field

8

elronbandel

Update adding_dataset.rst - a few more minor documentation changes

welisheva22
