
A replacement is needed for dataset lmsys/arena-hard-browser, which is gone from HF

Open dafnapension opened this issue 7 months ago • 2 comments

'lmsys/arena-hard-browser' is gone from HF. test_preparation does not catch the faulty prepare/cards/arena_hard/generation/english_gpt-4-0314_reference.py, which tries to access this (now missing) dataset, because test_preparation treats a missing dataset as an error to be ignored.
That card also participates in the performance test, which does not ignore missing datasets. For that reason, the card was fixed to use a similar dataset read from a GitHub repository. The fix has been committed to main; the PR is: https://github.com/IBM/unitxt/pull/1757
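As an illustration of the spirit of that fix (not the actual code merged in PR 1757), a loader can read the JSONL data directly from a raw GitHub URL instead of the vanished HF dataset; the repository path and file name below are assumptions and should be replaced with the ones used in the PR:

    # Minimal sketch, not the code merged in PR 1757: load an arena-hard
    # question file from a raw GitHub URL instead of the removed
    # 'lmsys/arena-hard-browser' HF dataset.
    from datasets import load_dataset

    # Assumed repository and file path; replace with the ones used in the PR.
    RAW_URL = (
        "https://raw.githubusercontent.com/lmarena/arena-hard-auto/"
        "main/data/arena-hard-v0.1/question.jsonl"
    )

    ds = load_dataset("json", data_files={"test": RAW_URL}, split="test")
    print(ds.column_names)  # inspect the fields the card has to map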

However, additional prepare-files exist that generate cards that look (in vain) for 'lmsys/arena-hard-browser':
prepare/cards/arena_hard/response_assessment/pairwise_comparative_rating/both_games_gpt4_judge.py
prepare/cards/arena_hard/response_assessment/pairwise_comparative_rating/both_games_mean_judgment_gpt4_judge.py
prepare/cards/arena_hard/response_assessment/pairwise_comparative_rating/first_game_only_gpt4_judge.py

These prepare files need to be modified in the spirit of the above-mentioned PR.

Among other things, the mapping between the data files referenced in the (currently faulty) prepare files and the data files found in the GitHub repository is not clear, since these prepare files use * wildcards in their paths, for example.
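To work out that mapping, one can enumerate the files that actually exist in the relocated dataset repository (the location suggested in the next comment) and compare them against what the wildcards were meant to match; a minimal sketch using huggingface_hub:

    # Sketch: list the files in the relocated dataset repo to see which
    # concrete files the old '*' wildcards in the prepare files should map to.
    from huggingface_hub import list_repo_files

    files = list_repo_files("lmarena-ai/arena-hard-auto", repo_type="dataset")
    for f in files:
        if f.startswith("data/arena-hard-v0.1/"):
            print(f)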

dafnapension avatar May 07 '25 13:05 dafnapension

It seems the data has moved here: https://huggingface.co/datasets/lmarena-ai/arena-hard-auto/tree/main/data/arena-hard-v0.1

Please check whether the file structure is the same and whether we can simply point the card to the new directory.

OfirArviv avatar May 08 '25 08:05 OfirArviv

Nice catch! (douze points, as they say today..) It is a new dataset, fewer than 20 days old! For prepare/cards/arena_hard/generation/english_gpt-4-0314_reference.py, simply reverting to the older version of LoadFromHFSpace (with updated paths, of course)

Yielded:
[Unitxt|CRITICAL|test_preparation.py:89] 2025-05-08 11:03:31,604 >> Testing preparation file: /home/runner/work/unitxt/unitxt/prepare/cards/arena_hard/generation/english_gpt-4-0314_reference.py failed with ignored error: The Huggingface space 'lmarena-ai/arena-hard-auto' was not found. Please check if the name is correct and you have access to the space.

and changing to LoadHF yielded:
raise DatasetNotFoundError(f"Dataset '{path}' doesn't exist on the Hub or cannot be accessed.") from e
datasets.exceptions.DatasetNotFoundError: Dataset 'lmarena-ai' doesn't exist on the Hub or cannot be accessed

LoadHF with path="lmarena-ai/arena-hard-auto" finally worked, but then the fields did not match. So I changed them per the previous version that I had prepared for the GitHub variant. Now test_performance, which does consume the card, fails with the same error that is reported on the HF site when checking why the dataset viewer is not available.

I am stuck.
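A direction that may be worth probing (a sketch under assumed file names, not something confirmed to work here): fetch the concrete JSONL file with hf_hub_download and load it through the generic 'json' builder, bypassing the repo-level dataset resolution that the HF dataset viewer also chokes on.

    # Sketch only: download one concrete JSONL from the dataset repo and load
    # it with the generic 'json' builder. The file name is an assumption.
    from huggingface_hub import hf_hub_download
    from datasets import load_dataset

    path = hf_hub_download(
        repo_id="lmarena-ai/arena-hard-auto",
        filename="data/arena-hard-v0.1/question.jsonl",  # assumed file name
        repo_type="dataset",
    )
    ds = load_dataset("json", data_files={"test": path}, split="test")
    print(ds.column_names)  # compare with the fields the card expects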

dafnapension avatar May 08 '25 11:05 dafnapension

This issue is stale because it has been open for 30 days with no activity.

github-actions[bot] avatar Jun 08 '25 03:06 github-actions[bot]

This issue is stale because it has been open for 30 days with no activity.

github-actions[bot] avatar Jul 10 '25 03:07 github-actions[bot]

This issue was closed because it has been inactive for 14 days since being marked as stale.

github-actions[bot] avatar Jul 24 '25 03:07 github-actions[bot]