A replacement is needed for dataset lmsys/arena-hard-browser, which is gone from HF
'lmsys/arena-hard-browser' is gone from HF. test_preparation does not catch the faulty prepare/cards/arena_hard/generation/english_gpt-4-0314_reference.py, which tries to access this missing dataset, because test_preparation treats a missing dataset as an error to be ignored.
That card also participates in the performance test, which does not ignore a missing dataset. To keep the performance test working, the card was fixed to use a similar dataset read from a GitHub repository. The fix has been committed to main; the PR is: https://github.com/IBM/unitxt/pull/1757
However, additional prepare files exist that generate cards that still look (in vain) for 'lmsys/arena-hard-browser':
prepare/cards/arena_hard/response_assessment/pairwise_comparative_rating/both_games_gpt4_judge.py
prepare/cards/arena_hard/response_assessment/pairwise_comparative_rating/both_games_mean_judgment_gpt4_judge.py
prepare/cards/arena_hard/response_assessment/pairwise_comparative_rating/first_game_only_gpt4_judge.py
These prepare files need to be modified in the spirit of the above-mentioned PR.
One open point: for these prepare files, the mapping between the data files they reference (currently faulty) and the data files found in the GitHub repository is not clear, because, for example, they use * in their paths.
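To make the wildcard mapping concrete, here is a minimal sketch of expanding a * pattern against a file listing and flagging patterns that select nothing. The listing and the patterns are hypothetical placeholders, not the actual contents of either repository, and fnmatch is only an approximation of how a loader resolves wildcards (here * also matches across "/").

```python
from fnmatch import fnmatchcase

# Hypothetical listing of the replacement repository (illustration only).
available_files = [
    "data/arena-hard-v0.1/question.jsonl",
    "data/arena-hard-v0.1/model_answer/gpt-4-0314.jsonl",
    "data/arena-hard-v0.1/model_answer/gpt-3.5-turbo-0125.jsonl",
]


def expand_pattern(pattern, files):
    """Return, sorted, the files that a wildcard pattern selects."""
    # fnmatchcase is case-sensitive on every platform; '*' also matches '/'.
    return sorted(f for f in files if fnmatchcase(f, pattern))


def unmatched_patterns(patterns, files):
    """Return the patterns that select no file at all."""
    return [p for p in patterns if not expand_pattern(p, files)]


if __name__ == "__main__":
    # Hypothetical patterns, standing in for those used by the prepare files.
    patterns = [
        "data/arena-hard-v0.1/model_answer/*.jsonl",
        "data/old_layout/*.jsonl",
    ]
    for pattern in patterns:
        print(pattern, "->", expand_pattern(pattern, available_files))
    print("unmatched:", unmatched_patterns(patterns, available_files))
```

Running such a check for each pattern in the three prepare files would show which ones still resolve to files in the replacement source and which ones need a new path.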
It seems the data has moved here: https://huggingface.co/datasets/lmarena-ai/arena-hard-auto/tree/main/data/arena-hard-v0.1
Please check whether the file structure is the same and whether we can just point the card to the new directory.
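One way to check whether the file structure is the same is to compare the two trees by path relative to their data root. The sketch below works on plain lists of paths; all paths in the example are hypothetical, and in practice the listings would have to come from the repositories themselves (for instance via huggingface_hub's list_repo_files, which is an assumption about tooling, not something this thread confirms was used).

```python
from pathlib import PurePosixPath


def relative_layout(paths, root):
    """Return the set of paths under `root`, expressed relative to it."""
    root_path = PurePosixPath(root)
    layout = set()
    for raw in paths:
        try:
            layout.add(str(PurePosixPath(raw).relative_to(root_path)))
        except ValueError:
            # The path is outside `root` (e.g. a top-level README); skip it.
            continue
    return layout


def compare_layouts(old_paths, old_root, new_paths, new_root):
    """Compare two file trees by relative path."""
    old = relative_layout(old_paths, old_root)
    new = relative_layout(new_paths, new_root)
    return {
        "missing_in_new": sorted(old - new),
        "only_in_new": sorted(new - old),
        "common": sorted(old & new),
    }


if __name__ == "__main__":
    # Hypothetical listings, for illustration only.
    old_listing = [
        "data/arena-hard-v0.1/question.jsonl",
        "data/arena-hard-v0.1/model_answer/gpt-4-0314.jsonl",
    ]
    new_listing = [
        "README.md",
        "data/arena-hard-v0.1/question.jsonl",
    ]
    print(compare_layouts(old_listing, "data/arena-hard-v0.1",
                          new_listing, "data/arena-hard-v0.1"))
```

If "missing_in_new" comes back empty, pointing the card at the new directory is plausibly enough as far as paths go; the field names inside the files would still need a separate check.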
Nice catch! (douze points, as they say today..) It is a new dataset, less than 20 days old!
For prepare/cards/arena_hard/generation/english_gpt-4-0314_reference.py, just returning to the older version that used LoadFromHFSpace (with updated paths, of course) yielded:
[Unitxt|CRITICAL|test_preparation.py:89] 2025-05-08 11:03:31,604 >> Testing preparation file: /home/runner/work/unitxt/unitxt/prepare/cards/arena_hard/generation/english_gpt-4-0314_reference.py failed with ignored error: The Huggingface space 'lmarena-ai/arena-hard-auto' was not found. Please check if the name is correct and you have access to the space.
and changing to LoadHF yielded:
raise DatasetNotFoundError(f"Dataset '{path}' doesn't exist on the Hub or cannot be accessed.") from e
datasets.exceptions.DatasetNotFoundError: Dataset 'lmarena-ai' doesn't exist on the Hub or cannot be accessed
LoadHF with path="lmarena-ai/arena-hard-auto" finally worked, but then the fields did not match, so I changed them following the previous version that I prepared for the GitHub variant. Now test_performance, which does consume the card, fails with the same error that the HF site reports when explaining why the dataset viewer is not available.
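A field mismatch of this kind can be spotted before the card is consumed by checking a few loaded records against the field names the card expects. This is a minimal sketch, independent of unitxt; the field names and records are hypothetical and do not reflect the actual schema of either dataset.

```python
def missing_fields(records, expected_fields):
    """Map each record index to the expected fields that record lacks."""
    report = {}
    for index, record in enumerate(records):
        absent = [field for field in expected_fields if field not in record]
        if absent:
            report[index] = absent
    return report


if __name__ == "__main__":
    # Hypothetical field names and records, for illustration only.
    expected = ["question_id", "turns"]
    sample = [
        {"question_id": "q1", "prompt": "Write a haiku."},
        {"question_id": "q2", "turns": [{"content": "Sort a list."}]},
    ]
    print(missing_fields(sample, expected))
```

An empty report means every sampled record carries the expected fields; any entry points at the records and field names that the card's preprocessing steps would need to remap.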
This issue is stale because it has been open for 30 days with no activity.
This issue was closed because it has been inactive for 14 days since being marked as stale.