deep-martin Bump sacrebleu from 1.5.1 to 2.3.1

Open dependabot[bot] opened this issue 2 years ago • 0 comments

Bumps sacrebleu from 1.5.1 to 2.3.1.

Release notes

v2.3.0

Features:

(#203) Added -tok flores101 and -tok flores200, a.k.a. spbleu. These are multilingual tokenizations that make use of the multilingual SPM models released by Facebook and described in the following papers:

Flores-101: https://arxiv.org/abs/2106.03193

Flores-200: https://arxiv.org/abs/2207.04672

(#213) Added JSON formatting for multi-system output (thanks to Manikanta Inugurthi @me-manikanta)

(#211) You can now list all test sets for a language pair with --list SRC-TRG. Thanks to Jaume Zaragoza (@ZJaume) for adding this feature.

Added WMT22 test sets (test set wmt22)

System outputs: include with wmt22. Also added wmt21/systems which will produce WMT21 submitted systems. To see available systems, give a dummy system to --echo, e.g., sacrebleu -t wmt22 -l en-de --echo ?

v2.2.0

This release contains an inner reworking of the data representations, contributed by @BrightXiaoHan. This enables the following features:

Added WMT21 datasets (which are properly XML-encoded)

Exposed corpus metadata via --echo (including origlang, docid, and genre, which are all available for most WMT corpora)

We also added a Korean tokenizer (--tok ko-mecab), contributed by @NoUnique.

In addition, there are a number of bug fixes and minor fixes:

Empty references (#161) are now allowed. Some of our speech test sets could not be used before this was fixed!

We now recommend that people use the spm tokenizer, particularly for CJK languages.

Internally, the tarball downloads and extracted test and metadata files now have names that are globally unique (e.g., .sacrebleu/wmt21/wmt_21.en-de.ref instead of .sacrebleu/wmt21/de-en.ref). The file extension corresponds to the field that gets passed to --echo.

v2.0.0

This is a major release that introduces statistical significance testing for BLEU, chrF and TER. It should be noted that as of v2.0.0, the default output format of the CLI utility is json rather than the old single-line output. All tools should adapt to this change if they parse standard output.
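For tools that parse standard output, one way to adapt is to try JSON first and fall back to the old single-line format. The sketch below is a minimal illustration, not sacrebleu's own code: the helper name `parse_score` is hypothetical, the sample strings are hand-written for the example, and it assumes that the JSON object carries a numeric `score` field and that the old format places the score right after the first ` = `.

```python
import json
import re


def parse_score(stdout: str) -> float:
    """Extract the metric score from either output style."""
    text = stdout.strip()
    try:
        # v2-style output: a JSON object (assumed to contain a "score" key).
        return float(json.loads(text)["score"])
    except (json.JSONDecodeError, KeyError, TypeError):
        pass
    # Old single-line style: "<signature> = <score> ...".
    match = re.search(r" = (-?\d+(?:\.\d+)?)", text)
    if match is None:
        raise ValueError("unrecognized score output")
    return float(match.group(1))


# Hand-written sample strings, for illustration only.
new_style = '{"name": "BLEU", "score": 20.8, "signature": "nrefs:1|case:mixed"}'
old_style = "BLEU+case.mixed+numrefs.1 = 20.8 54.4/26.6/14.9/8.7"

print(parse_score(new_style))
print(parse_score(old_style))
```

Trying JSON first keeps a single code path working on both sides of the upgrade, which may be useful while a project still pins 1.5.1 in some environments.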

Build: Add Windows and OS X testing to github workflow

Improve documentation and type annotations.

Drop Python < 3.6 support and migrate to f-strings.

Drop input type manipulation through isinstance checks. If the user does not obey the expected annotations, exceptions will be raised. Robustness attempts led to confusion and obfuscated score errors in the past (fixes #121)

Use colored strings in tabular outputs (multi-system evaluation mode) with the help of the colorama package.

tokenizers: Add caching to tokenizers, which seems to speed things up a bit.

intl tokenizer: Use regex module. Speed goes from ~4 seconds to ~0.6 seconds for a particular test set evaluation. (fixes #46)

Signature: Formatting changed (mostly to remove '+' separator as it was interfering with chrF++). The field separator is now '|' and key values are separated with ':' rather than '.'.

Metrics: Scale all metrics into the [0, 100] range (fixes #140)

BLEU: In case of no n-gram matches at all, skip smoothing and return 0.0 BLEU (fixes #141).

BLEU: allow modifying max_ngram_order (fixes #156)

CHRF: Added multi-reference support, verified the scores against chrF++.py, added test case.

... (truncated)
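The signature format change noted under v2.0.0 above ('|' between fields, ':' between key and value) affects any script that reads signatures. A minimal sketch of parsing the new layout follows. The function name `parse_signature` and the example signature string are hypothetical and hand-written; real signatures may carry different keys.

```python
def parse_signature(signature: str) -> dict:
    """Split a v2-style signature into a {key: value} dict."""
    fields = {}
    for field in signature.split("|"):
        # Split on the first ':' only, so values may contain dots or colons.
        key, _, value = field.partition(":")
        fields[key] = value
    return fields


# Hand-written example signature, for illustration only.
sig = "nrefs:1|case:mixed|tok:13a|smooth:exp|version:2.3.1"
parsed = parse_signature(sig)
print(parsed["tok"], parsed["version"])
```

Because the key/value separator is now ':', a dotted value such as a version number survives intact when each field is split on its first ':'.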

Changelog

Sourced from sacrebleu's changelog.

2.3.1 (2022-10-18) Bugfix:

Set lru_cache to 2**16 for SPM tokenizer (was set to infinite)
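To illustrate what a bounded cache of this size means in Python, here is a minimal sketch using the standard `functools.lru_cache`. The `tokenize` function is a hypothetical stand-in, not the SPM tokenizer itself, and how sacrebleu applies the decorator internally is not shown in this changelog.

```python
from functools import lru_cache


@lru_cache(maxsize=2**16)  # bounded: least recently used entries are evicted
def tokenize(line: str) -> str:
    # Stand-in for an expensive tokenizer call (hypothetical).
    return " ".join(line.split())


tokenize("Hello   world")
tokenize("Hello   world")  # second call is served from the cache
info = tokenize.cache_info()
print(info.hits, info.misses, info.maxsize)
```

With `maxsize=None` the cache would grow without limit, so memory use would scale with the number of distinct lines seen; a fixed `maxsize` caps it.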

2.3.0 (2022-10-18) Features:

(#203) Added -tok flores101 and -tok flores200, a.k.a. spbleu. These are multilingual tokenizations that make use of the multilingual SPM models released by Facebook and described in the following papers:

Flores-101: https://arxiv.org/abs/2106.03193

Flores-200: https://arxiv.org/abs/2207.04672

(#213) Added JSON formatting for multi-system output (thanks to Manikanta Inugurthi @me-manikanta)

(#211) You can now list all test sets for a language pair with --list SRC-TRG. Thanks to Jaume Zaragoza (@ZJaume) for adding this feature.

Added WMT22 test sets (test set wmt22)

System outputs: include with wmt22. Also added wmt21/systems which will produce WMT21 submitted systems. To see available systems, give a dummy system to --echo, e.g., sacrebleu -t wmt22 -l en-de --echo ?

2.2.1 (2022-09-13) Bugfix: Standard usage was returning (and using) each reference twice.

2.2.0 (2022-07-25) Features:

Added WMT21 datasets (thanks to @BrightXiaoHan)

--echo now exposes document metadata where available (e.g., docid, genre, origlang)

Bugfix: allow empty references (#161)

Adds a Korean tokenizer (thanks to @NoUnique)

Under the hood:

Moderate code refactoring

Processed files have adopted a more sensible internal naming scheme under ~/.sacrebleu (e.g., wmt17_ms.zh-en.src instead of zh-en.zh)

Processed file extensions correspond to the values passed to --echo (e.g., "src")

Now explicitly representing NoneTokenizer

Got rid of the ".lock" lockfile for downloading (using the tarball itself)

Many thanks to @BrightXiaoHan (https://github.com/BrightXiaoHan) for the bulk of the code contributions in this release.

2.1.0 (2022-05-19) Features:

Added -tok spm for multilingual SPM tokenization (#168) (thanks to Naman Goyal and James Cross at Facebook)

Fixes:

Handle potential memory usage issues due to LRU caching in tokenizers (#167)

Bugfix: BLEU.corpus_score() now using max_ngram_order (#173)

Upgraded ja-mecab to 1.0.5 (#196)

... (truncated)

Commits

c7fc166 increment version in CHANGELOG
a885403 Set lru_cache for SPM tokenizer
5166cf7 Added WMT22 data (closes #215) (#216)
e416ee2 Added wmt21/systems for system outputs (#214)
37de171 Add support to print multi system results as JSON (#213)
1d04b6f Ability to filter available testsets by language pair (#211)
38eaf95 Added flores200 spm model (#203)
1142ba4 Multiref bugfix (#204)
a73315b Version 2.2 (#200)
8e7abf5 use TypeError/ValueError for argument checking errors (closes #189) (#190)
Additional commits viewable in compare view

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.

Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

@dependabot rebase will rebase this PR
@dependabot recreate will recreate this PR, overwriting any edits that have been made to it
@dependabot merge will merge this PR after your CI passes on it
@dependabot squash and merge will squash and merge this PR after your CI passes on it
@dependabot cancel merge will cancel a previously requested merge and block automerging
@dependabot reopen will reopen this PR if it is closed
@dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
@dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
@dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
@dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Oct 24 '22 19:10 dependabot[bot]
