fix model names
This PR fixes the model name problems in the Qwen2-related code and docs.
Thanks a lot! @JustinLin610
I will check with our CI runners and come back to you.
Running on our runner (T4)
RUN_SLOW=1 TF_FORCE_GPU_ALLOW_GROWTH=yes python3 -m pytest -v tests/models/qwen2/
Repo id issue
FAILED tests/models/qwen2/test_tokenization_qwen2.py::Qwen2TokenizationTest::test_tokenizer_integration - OSError: Can't load tokenizer for 'Qwen/Qwen1.5-7B'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'Qwe...
Results not matching expected values
(we can simply update the expected values if you think that's the way to go, as long as the model still behaves correctly)
FAILED tests/models/qwen2/test_modeling_qwen2.py::Qwen2IntegrationTest::test_model_450m_logits - AssertionError: Tensor-likes are not close!
FAILED tests/models/qwen2/test_modeling_qwen2.py::Qwen2IntegrationTest::test_model_450m_long_prompt_sdpa - AssertionError: 'My favourite condiment is 100% ketchup. I love it on everything. I’m not a big' != 'My favourite condiment is ____(醋).\n根据提示"醋"可知,这里is单数,主语填'
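If updating the expected values is the way to go, here is a minimal sketch (not the actual test code; the checkpoint, prompt ids, and slice indices are placeholders) of how they could be regenerated on the CI hardware before pasting them into the test:

```python
# Sketch: regenerate an expected logits slice on the target GPU.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen1.5-0.5B", torch_dtype=torch.float16
).to("cuda")

# Placeholder input ids; the real test defines its own fixed prompt.
input_ids = torch.tensor([[1, 306, 4658, 278, 6593, 310, 2834, 338]], device="cuda")
with torch.no_grad():
    logits = model(input_ids).logits

# Paste this printed slice into the test's expected-values constant.
print(logits[0, -1, :30])
```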
GPU OOM: maybe use shorter output lengths?
FAILED tests/models/qwen2/test_modeling_qwen2.py::Qwen2IntegrationTest::test_model_450m_generation - torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 594.00 MiB. GPU 0 has a total capacity of 14.76 GiB of which 224.75 MiB is free. Process 2822 has 14.53 GiB memory in use. Of the allocated me...
FAILED tests/models/qwen2/test_modeling_qwen2.py::Qwen2IntegrationTest::test_speculative_generation - torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB. GPU 0 has a total capacity of 14.76 GiB of which 6.75 MiB is free. Process 2822 has 14.75 GiB memory in use. Of the allocated memor...
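For example, something along these lines could keep the generation tests within the T4's 16 GiB (a sketch; the prompt and the max_new_tokens value are illustrative, not the test's actual ones):

```python
# Sketch: shorter output length plus explicit cleanup between tests.
import gc
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen1.5-0.5B")
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen1.5-0.5B", torch_dtype=torch.float16
).to("cuda")

inputs = tokenizer("My favourite condiment is", return_tensors="pt").to("cuda")
# Cap the generated length to lower peak memory.
output = model.generate(**inputs, max_new_tokens=20, do_sample=False)
print(tokenizer.decode(output[0], skip_special_tokens=True))

# Free the model and cached allocator blocks before the next test runs.
del model
gc.collect()
torch.cuda.empty_cache()
```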
SDPA issue: this might be tricky
FAILED tests/models/qwen2/test_modeling_qwen2.py::Qwen2ModelTest::test_eager_matches_sdpa_generate - AssertionError: False is not true
FAILED tests/models/qwen2/test_modeling_qwen2.py::Qwen2ModelTest::test_eager_matches_sdpa_inference_0_float16 - AssertionError: False is not true : padding_side=left, use_mask=False, batch_size=1, enable_kernels=False: mean relative difference: 4.090e+00, torch atol = 0.005, torch rtol = 0.005
FAILED tests/models/qwen2/test_modeling_qwen2.py::Qwen2ModelTest::test_eager_matches_sdpa_inference_2_float32 - AssertionError: False is not true : padding_side=left, use_mask=False, batch_size=1, enable_kernels=False: mean relative difference: 2.817e+00, torch atol = 1e-06, torch rtol = 0.0001
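To help narrow this down, here is a sketch of how the eager vs. SDPA outputs could be compared manually (the checkpoint is a stand-in; the failing test itself uses a tiny randomly initialized model):

```python
# Sketch: load the same checkpoint with both attention backends and diff the logits.
import torch
from transformers import AutoModelForCausalLM

eager = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen1.5-0.5B", attn_implementation="eager", torch_dtype=torch.float16
).to("cuda")
sdpa = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen1.5-0.5B", attn_implementation="sdpa", torch_dtype=torch.float16
).to("cuda")

input_ids = torch.randint(0, eager.config.vocab_size, (1, 16), device="cuda")
with torch.no_grad():
    diff = (eager(input_ids).logits - sdpa(input_ids).logits).abs().max()

# A mean relative difference around 4e+00, as in the failure above, points at a
# masking/backend bug rather than ordinary float16 noise.
print(diff)
```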
For check_code_quality, let's resolve it toward the end of the PR (before we merge to main).
I think we only have the code quality issue based on the tests? The reported issues are from your internal CI tests, right? I tested manually and fixed the mentioned problems. For SDPA, I did not run into issues, btw.
https://huggingface.co/Qwen/Qwen1.5-7B is the repo id. Anyway, I'll switch to Qwen/Qwen1.5-0.5B for consistency.
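A quick sanity check that the repo id resolves (sketch):

```python
from transformers import AutoTokenizer

# Both repo ids exist on the Hub; the tests will standardize on the 0.5B one.
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen1.5-0.5B")
print(tokenizer("hello world").input_ids)
```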
@ydshieh feel free to send me feedback 🚀
Hi @JustinLin610 Thank you for the update again. OK, I will take care of them, but could you share which GPU you ran the tests on? (You ran with RUN_SLOW=1, right?)
I ran with an A100 80G, Python 3.12, PyTorch 2.2. You mean running the eval with export RUN_SLOW=1 set in the environment? No, I didn't.
The integration tests (i.e. Qwen2IntegrationTest) are only run with export RUN_SLOW=1. Sorry if I didn't make that clear. If you can run them again and see whether there is something you can help us fix, that would be much appreciated: this means making sure the tests pass in your run (with export RUN_SLOW=1).
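For context, the gating looks roughly like this in transformers tests (the class and test names here are placeholders):

```python
# Sketch: integration tests are decorated with @slow, which skips them
# unless RUN_SLOW is set to a truthy value (e.g. RUN_SLOW=1 or RUN_SLOW=yes).
import unittest
from transformers.testing_utils import slow

class ExampleIntegrationTest(unittest.TestCase):
    @slow  # skipped unless RUN_SLOW is set in the environment
    def test_generation(self):
        ...  # the real tests load the checkpoint and compare outputs
```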
(And if something is only failing on our T4, I could fix it on my side.)
Although it's just internal CI, it's an important part of the ecosystem that allows us to monitor whether something has been broken by newly merged PRs. I believe Qwen2 would benefit from having a working Qwen2IntegrationTest 🤗.
@JustinLin610 Could you run the following on your machine
RUN_SLOW=1 python -m pytest -v tests/models/qwen2/test_modeling_qwen2.py
and share the logs? Let's see how it goes: fix whatever can be done on your side, merge it, and I will take care of the rest (if any) 🚀
Thank you in advance 🙏