dafnapension

Results 15 issues of dafnapension

Hi @michal-jacovi and @elronbandel, just to kick off: I tweaked test_card to invoke load_dataset_builder (rather than LoadHF), and printed the description, citation, homepage, and whatever load_dataset_builder harvested for...

The new feature of Metric, sample-from-groups-scores, employs the CI over instances generated ad-hoc, one per group (a group is a subset of the input instances, whose member instances are those...

Suggest a scheme of "aggregator" to be similarly employed by all three types of MetricsWithConfidenceInterval.

Hi @elronbandel, I had this in mind re the issue you assigned to me. Is this the direction you had in mind? If yes, I will clean the...

Still without success.

`'lmsys/arena-hard-browser'` is gone from HF. `test_preparation` does not catch the faulty `prepare/cards/arena_hard/generation/english_gpt-4-0314_reference.py`, which tries to access this (gone) dataset, because a missing dataset is considered by `test_preparation` an error to be...

`__type__` in the catalog is expressed as a dict {`module`: module, `name`: class_name}, from which classes are instantiated through Python's import utilities. This means that if a class `c` is defined in...
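
A minimal sketch of how a {`module`, `name`} entry can be resolved into an instance, assuming the standard `importlib` route. The helper `instantiate_from_type` is hypothetical and is not the library's actual loader; a standard-library class stands in for a catalog class.

```python
import importlib


def instantiate_from_type(type_spec, **kwargs):
    """Import type_spec["module"], look up type_spec["name"] there, and call it.

    Hypothetical helper for illustration only.
    """
    module = importlib.import_module(type_spec["module"])
    cls = getattr(module, type_spec["name"])
    return cls(**kwargs)


# A standard-library class is used here purely as a stand-in.
spec = {"module": "collections", "name": "OrderedDict"}
obj = instantiate_from_type(spec, a=1)
```

In this sketch the lookup goes through the stored module path, so an entry resolves only while that module can be imported and still exposes the named attribute.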