dafnapension
Hi @michal-jacovi and @elronbandel, just to kick off, I tweaked test_card, making it invoke load_dataset_builder (rather than LoadHF), and printed the description, citation, homepage, and whatever load_dataset_builder harvested for...
The new Metric feature, sample-from-groups-scores, computes the CI over instances generated ad hoc, one per group (a group is a subset of the input instances whose member instances are those...
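As a rough illustration of computing a CI over one score per group, here is a minimal sketch. The description above is truncated, so everything specific here is an assumption: the `group_id` and `score` field names, the use of the mean as the per-group score, and the percentile bootstrap. The helpers `group_scores` and `bootstrap_ci` are hypothetical and are not part of the library.

```python
import random
from collections import defaultdict
from statistics import mean


def group_scores(instances, group_key="group_id", score_key="score"):
    """Collapse instances into one score per group (assumed here: the group mean)."""
    groups = defaultdict(list)
    for inst in instances:
        groups[inst[group_key]].append(inst[score_key])
    # One ad-hoc "instance" (a single score) per group, in first-seen order.
    return [mean(scores) for scores in groups.values()]


def bootstrap_ci(values, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI of the mean, resampling the per-group scores."""
    rng = random.Random(seed)
    n = len(values)
    means = sorted(mean(rng.choices(values, k=n)) for _ in range(n_resamples))
    low = means[int((alpha / 2) * n_resamples)]
    high = means[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return low, high


instances = [
    {"group_id": "a", "score": 1.0},
    {"group_id": "a", "score": 0.0},
    {"group_id": "b", "score": 1.0},
    {"group_id": "b", "score": 1.0},
    {"group_id": "c", "score": 0.0},
]

per_group = group_scores(instances)
low, high = bootstrap_ci(per_group)
print(per_group, (low, high))
```

Resampling the per-group scores, rather than the raw instances, treats each group as the unit of variation, which is one possible reading of "one per group".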
Suggests an "aggregator" scheme to be employed in the same way by all three types of MetricsWithConfidenceInterval.
Hi @elronbandel, I had this in mind regarding the issue you assigned to me. Is this the direction you had in mind? If yes, I will clean the...
Still without success.
`'lmsys/arena-hard-browser'` is gone from HF. `test_preparation` does not catch the faulty `prepare/cards/arena_hard/generation/english_gpt-4-0314_reference.py`, which tries to access this (gone) dataset, because `test_preparation` considers a missing dataset an error to be...
`__type__` in the catalog is expressed as a dict {`module`: module, `name`: class_name}, from which classes are instantiated through Python's import utilities. This means that if a class `c` is defined in...
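To make the mechanism concrete, here is a minimal sketch of resolving a {`module`, `name`} dict to a class with Python's standard `importlib`. The helper name `instantiate_from_type` is hypothetical, and a standard-library class stands in for a catalog class; the library's own loader may differ.

```python
import importlib


def instantiate_from_type(type_spec, **kwargs):
    """Hypothetical helper: import the module named in type_spec, look up the class, instantiate it."""
    module = importlib.import_module(type_spec["module"])
    cls = getattr(module, type_spec["name"])
    return cls(**kwargs)


# A stdlib class is used purely for illustration.
spec = {"module": "fractions", "name": "Fraction"}
obj = instantiate_from_type(spec, numerator=3, denominator=4)
print(type(obj).__name__, obj)
```

Because the lookup goes through the module path stored in the dict, it only succeeds if the named module can be imported and exposes an attribute with that class name.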