torchtune
update eval wrapper to match lm_eval's api change
Context
What is the purpose of this PR? Is it to
- [ ] add a new feature
- [x] fix a bug
- [ ] update tests and/or documentation
- [ ] other (please add here)
Please link to any issues this PR addresses. None since I did not create an issue.
Changelog
What are the changes made in this PR?
This PR modifies the evaluation script. A recent change in lm_eval removed the default value of the `pretrained` argument in their `HFLM` class. As a result, running the current version of torchtune fails with a missing-argument error in `__init__`. In the long term, we should consider updating this script to follow their API changes.
This change passes the previous default value for that argument; its value does not affect functionality, since `self._model` is overwritten by our model anyway. Alternatively, one could pass `None` or the model itself, but lm_eval issues a warning whenever this argument is not a string.
If you have better ideas or suggestions regarding this issue, please feel free to close this PR and create a new one with your proposed changes.
Test plan
Please make sure to do each of the following if applicable to your PR. (If you're not sure about any one of these just ask and we will happily help.)
- [ ] run pre-commit hooks and linters (make sure you've first installed via `pre-commit install`)
- [ ] add unit tests for any new functionality
- [ ] update docstrings for any new or updated methods or classes
- [ ] run unit tests via `pytest tests`
- [ ] run recipe tests via `pytest tests -m integration_test`
- [ ] manually run any new or modified recipes with sufficient proof of correctness
- [ ] include relevant commands and any other artifacts in this summary (pastes of loss curves, eval results, etc.)
:link: Helpful Links
:test_tube: See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchtune/950
:white_check_mark: No Failures
As of commit d591b3ed6fa315339517c8903beeea2ff4d8ba4b with merge base fa1392b08598202971a3afd291ec50bc35ceb342:
:green_heart: Looks good so far! There are no failures yet. :green_heart:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Hi @water-vapor!
Thank you for your pull request and welcome to our community.
Action Required
In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.
Process
In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.
Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with `CLA signed`. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.
If you have received this in error or have any questions, please contact us at [email protected]. Thanks!
Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!
Thanks for this quick fix @water-vapor !
Can you post the output of a run with this updated change for posterity?
Sure. For the llama2 evaluation in the docs' example, the output after this fix is:
```
2024-05-08:02:54:24,677 INFO [_utils.py:34] Running EleutherEvalRecipe with resolved config:
checkpointer:
  _component_: torchtune.utils.FullModelHFCheckpointer
  checkpoint_dir: /tmp/Llama-2-7b-hf
  checkpoint_files:
  - pytorch_model-00001-of-00002.bin
  - pytorch_model-00002-of-00002.bin
  model_type: LLAMA2
  output_dir: /tmp/Llama-2-7b-hf
  recipe_checkpoint: null
device: cuda
dtype: bf16
limit: null
max_seq_length: 4096
model:
  _component_: torchtune.models.llama2.llama2_7b
quantizer: null
seed: 217
tasks:
- truthfulqa_mc2
tokenizer:
  _component_: torchtune.models.llama2.llama2_tokenizer
  path: /tmp/Llama-2-7b-hf/tokenizer.model

2024-05-08:02:54:24,794 DEBUG [seed.py:59] Setting manual seed to local seed 217. Local seed is seed + rank = 217 + 0
2024-05-08:02:54:29,036 INFO [eleuther_eval.py:168] Model is initialized with precision torch.bfloat16.
2024-05-08:02:54:29,045 INFO [eleuther_eval.py:152] Tokenizer is initialized from file.
2024-05-08:02:54:29,332 WARNING [logging.py:61] Detected kernel version 5.4.0, which is below the recommended minimum of 5.5.0; this can cause the process to hang. It is recommended to upgrade the kernel to the minimum version or higher.
2024-05-08:02:54:29,332 INFO [huggingface.py:165] Using device 'cuda:0'
/home/[redacted]/miniconda3/envs/torchtune/lib/python3.11/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
2024-05-08:02:54:35,132 INFO [eleuther_eval.py:189] Running evaluation on ['truthfulqa_mc2'] tasks.
2024-05-08:02:54:35,133 INFO [task.py:398] Building contexts for truthfulqa_mc2 on rank 0...
100%|███████████████████████████████████████████████████████████████████████████| 817/817 [00:00<00:00, 1200.52it/s]
2024-05-08:02:54:35,848 INFO [evaluator.py:395] Running loglikelihood requests
Running loglikelihood requests: 100%|███████████████████████████████████████████| 5882/5882 [03:15<00:00, 30.08it/s]
2024-05-08:02:57:54,520 INFO [eleuther_eval.py:196] Eval completed in 205.23 seconds.
2024-05-08:02:57:54,520 INFO [eleuther_eval.py:198] truthfulqa_mc2: {'acc,none': 0.3891973334302967, 'acc_stderr,none': 0.013567843287741146, 'alias': 'truthfulqa_mc2'}
```
Before the fix:
```
2024-05-08:01:29:22,962 DEBUG [seed.py:59] Setting manual seed to local seed 217. Local seed is seed + rank = 217 + 0
2024-05-08:01:29:25,098 INFO [eleuther_eval.py:168] Model is initialized with precision torch.bfloat16.
2024-05-08:01:29:25,106 INFO [eleuther_eval.py:152] Tokenizer is initialized from file.
Traceback (most recent call last):
  File "/home/[redacted]/miniconda3/envs/torchtune/bin/tune", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/[redacted]/miniconda3/envs/torchtune/lib/python3.11/site-packages/torchtune/_cli/tune.py", line 49, in main
    parser.run(args)
  File "/home/[redacted]/miniconda3/envs/torchtune/lib/python3.11/site-packages/torchtune/_cli/tune.py", line 43, in run
    args.func(args)
  File "/home/[redacted]/miniconda3/envs/torchtune/lib/python3.11/site-packages/torchtune/_cli/run.py", line 179, in _run_cmd
    self._run_single_device(args)
  File "/home/[redacted]/miniconda3/envs/torchtune/lib/python3.11/site-packages/torchtune/_cli/run.py", line 93, in _run_single_device
    runpy.run_path(str(args.recipe), run_name="__main__")
  File "<frozen runpy>", line 291, in run_path
  File "<frozen runpy>", line 98, in _run_module_code
  File "<frozen runpy>", line 88, in _run_code
  File "/home/[redacted]/miniconda3/envs/torchtune/lib/python3.11/site-packages/recipes/eleuther_eval.py", line 211, in <module>
    sys.exit(recipe_main())
             ^^^^^^^^^^^^^
  File "/home/[redacted]/miniconda3/envs/torchtune/lib/python3.11/site-packages/torchtune/config/_parse.py", line 50, in wrapper
    sys.exit(recipe_main(conf))
             ^^^^^^^^^^^^^^^^^
  File "/home/[redacted]/miniconda3/envs/torchtune/lib/python3.11/site-packages/recipes/eleuther_eval.py", line 207, in recipe_main
    recipe.evaluate()
  File "/home/[redacted]/miniconda3/envs/torchtune/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/[redacted]/miniconda3/envs/torchtune/lib/python3.11/site-packages/recipes/eleuther_eval.py", line 175, in evaluate
    model_eval_wrapper = _EvalWrapper(
                         ^^^^^^^^^^^^^
  File "/home/[redacted]/miniconda3/envs/torchtune/lib/python3.11/site-packages/recipes/eleuther_eval.py", line 58, in __init__
    super().__init__(device=str(device))
TypeError: HFLM.__init__() missing 1 required positional argument: 'pretrained'
```
I still get the error. Is it fixed in main? I installed torchtune by running `pip install torchtune`. However, if I install it by cloning the git repo, I don't get this error.
Hi @sambit19, if you are installing the stable version of torchtune, it will not include these changes, as they landed after our 0.1 release. Installing from a git clone should work, though. Alternatively, you can install a nightly version of torchtune by following the instructions here.