unitxt
🦄 Unitxt: a Python library for getting data fired up and set for training and evaluation
In templates.py, MultiReferenceTemplate derives from InputOutputTemplate. But MultiReferenceTemplate.outputs_to_target_and_references() doesn't call the base class to render the target using `output_format`. As a result, you just get the target mirrored when running...
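A minimal sketch of the fix this issue seems to ask for, using stand-in classes rather than the real ones in templates.py. The constructor arguments, the `references_field` name, and the `str.format`-based rendering are assumptions made for illustration only; the point is just that the subclass delegates to the base class so every reference is rendered through `output_format`.

```python
class InputOutputTemplate:
    """Stand-in for the base template (hypothetical, not unitxt's class)."""

    def __init__(self, output_format):
        self.output_format = output_format

    def outputs_to_target_and_references(self, outputs):
        # Render the single target through output_format.
        target = self.output_format.format(**outputs)
        return target, [target]


class MultiReferenceTemplate(InputOutputTemplate):
    """Stand-in subclass whose references field holds a list of answers."""

    def __init__(self, output_format, references_field):
        super().__init__(output_format)
        self.references_field = references_field  # assumed field name

    def outputs_to_target_and_references(self, outputs):
        references = []
        for reference in outputs[self.references_field]:
            # Delegate to the base class so each reference is formatted
            # with output_format instead of being passed through as-is.
            single = {**outputs, self.references_field: reference}
            rendered, _ = super().outputs_to_target_and_references(single)
            references.append(rendered)
        # Assumption: the first rendered reference serves as the target.
        return references[0], references


template = MultiReferenceTemplate("Answer: {answers}", references_field="answers")
target, references = template.outputs_to_target_and_references(
    {"answers": ["Paris", "paris"]}
)
```

With this delegation in place, the references carry the same formatting as the target rather than mirroring the raw field values.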
Currently, running bertScore reports the scores ["f1", "precision", "recall"]. This may be unclear to someone who expects BertScore to appear in the metric name. Especially, when they use...
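One possible way to make the names self-describing, sketched under the assumption that the scores arrive as a plain dict. The helper `prefix_score_names` and the `bertscore` prefix are hypothetical and are not part of unitxt.

```python
def prefix_score_names(scores, prefix):
    """Return a copy of scores with each key prefixed by the metric name.

    Hypothetical helper for illustration; not an existing unitxt function.
    """
    return {f"{prefix}_{name}": value for name, value in scores.items()}


raw_scores = {"f1": 0.91, "precision": 0.93, "recall": 0.89}  # made-up values
named_scores = prefix_score_names(raw_scores, "bertscore")
```

The resulting keys are `bertscore_f1`, `bertscore_precision`, and `bertscore_recall`, so the metric is identifiable from the score name alone.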
Why is this the accepted behavior (strict=False was set a long time ago)?
The results of running the main metric in used in the card (matthews_correlation) over simulated predictions...
Change xlsum.py to run on all languages (remove `if lang == langs[0]:`).
Run `python prepare/cards/xlsum.py`.
Traceback (most recent call last):
  File "/home/runner/work/unitxt/unitxt/tests/test_preperation.py", line 47, in test_preprations
    import_module_from_file(file)
  File "/home/runner/work/unitxt/unitxt/tests/test_preperation.py", line...
This is the root cause: https://github.com/IBM/unitxt/blob/main/src/unitxt/test_utils/metrics.py#L75. I think we need to test whether the inner metric in MetricPipeline is a GlobalMetric object and, if so, set its n_resamples to 3.
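A rough sketch of the proposed check, with stand-in classes in place of the real ones. The inner-metric attribute name `metric`, the default of 1000 resamples, and the helper `limit_resamples_for_testing` are all assumptions for illustration, not unitxt's actual API.

```python
class GlobalMetric:
    """Stand-in for a metric that bootstraps confidence intervals."""

    def __init__(self, n_resamples=1000):  # default value is an assumption
        self.n_resamples = n_resamples


class MetricPipeline:
    """Stand-in for a pipeline wrapping an inner metric."""

    def __init__(self, metric):
        self.metric = metric  # assumed attribute name for the inner metric


def limit_resamples_for_testing(metric, n_resamples=3):
    """Lower n_resamples on a GlobalMetric, unwrapping a MetricPipeline first."""
    inner = metric.metric if isinstance(metric, MetricPipeline) else metric
    # Only touch metrics that already have resampling enabled.
    if isinstance(inner, GlobalMetric) and inner.n_resamples:
        inner.n_resamples = n_resamples
    return metric


pipeline = limit_resamples_for_testing(MetricPipeline(GlobalMetric()))
disabled = limit_resamples_for_testing(GlobalMetric(n_resamples=None))
```

A metric with resampling disabled (`n_resamples=None`) is left unchanged, which matches the truthiness check in the existing test helper.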
Today confidence intervals are computed by default for the main_score. This [PR](https://github.com/IBM/unitxt/pull/431) adds the capability of computing confidence intervals for additional scores. We would like to change the confidence interval...
There are a few open issues: There is no multi_label template (a fix is required for unfair_tos and reuters). Can I use the text_type argument? I wonder if dbpedia_14 is of type text...
see: https://github.com/IBM/unitxt/pull/403
@eladven @matanor In test_utils/metrics.py/test_metric, for a GlobalMetric we have:
    if isinstance(metric, GlobalMetric) and metric.n_resamples:
        metric.n_resamples = 3  # Use a low number of resamples in testing for GlobalMetric, to save runtime...
Loaders need it, do other operators? https://github.com/IBM/unitxt/pull/339