
Automated Hugging Face transformers coverage testing

Open t-vi opened this issue 7 months ago • 3 comments

We want to systematically assess where we are w.r.t. covering HF transformers cases.

  • we want automated e2e tests; #1984 may be reusable for how to launch them automatically,
  • we want to use the latest version of transformers,
  • initially we only run forward; later we need to test both inference (incl. static KV cache) and training (fw + bw),
  • this is for testing and coverage reporting, not for fixing bugs or improving performance,
  • let's start with 10-ish models (there is an internal list, but I don't know whether it is public yet).

Afterwards, we want to file issues for errors we hit and also look at whether the error messages are OK or need improvement.
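
To make the plan above concrete, here is a minimal sketch of what a forward-only coverage run could look like. All names (`CoverageResult`, `run_forward_coverage`, `summarize`) are hypothetical and not part of thunder. The sketch assumes `compile_fn` would be something like `thunder.jit` and that each factory builds one Hugging Face model; both are passed in as plain callables here so that the reporting logic stays independent of either library.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class CoverageResult:
    model_id: str
    passed: bool
    error_type: Optional[str] = None
    error_message: Optional[str] = None


def run_forward_coverage(
    model_factories: Dict[str, Callable[[], Callable]],
    make_inputs: Callable[[str], tuple],
    compile_fn: Callable[[Callable], Callable],
) -> List[CoverageResult]:
    """Compile each model and run one forward pass, recording failures."""
    results = []
    for model_id, factory in model_factories.items():
        try:
            model = factory()
            compiled = compile_fn(model)  # assumption: e.g. thunder.jit
            compiled(*make_inputs(model_id))
            results.append(CoverageResult(model_id, True))
        except Exception as exc:
            # Record the failure instead of fixing it; the message is kept
            # so it can be reviewed for clarity and filed as an issue.
            results.append(
                CoverageResult(model_id, False, type(exc).__name__, str(exc))
            )
    return results


def summarize(results: List[CoverageResult]) -> str:
    """Render a short plain-text coverage report."""
    passed = sum(r.passed for r in results)
    lines = [f"{passed}/{len(results)} models passed forward"]
    for r in results:
        if not r.passed:
            lines.append(f"FAIL {r.model_id}: {r.error_type}: {r.error_message}")
    return "\n".join(lines)
```

Keeping the error type and message per model means the same run produces both the coverage number and the raw material for the follow-up issues.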

Please reach out to Teja Pulagam (OSS contributor) to collaborate.

You could use the initial phases of the "decomposed" jit:

https://github.com/Lightning-AI/lightning-thunder/blob/3aa706a92092738e24a3f9dc08c96743646105e5/thunder/tests/test_core.py#L3216

t-vi avatar May 01 '25 17:05 t-vi

We have a test for Hugging Face CausalLM models (test_hf_for_nemo; maybe it should be renamed to just test_hf), but unfortunately, it's skipped for two months (https://github.com/Lightning-AI/lightning-thunder/commit/a6698fafc0fe652801312860d8d86bf3322f4f6b by @k223kim).

The test could be updated to add the parametrization to use thunder.jit in addition to thunderfx.
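
A rough sketch of how that parametrization could be built, assuming the test takes a model id and a backend name. The model ids below are placeholders, and `build_cases` is a hypothetical helper, not something that exists in the test suite today.

```python
from itertools import product

# Placeholder ids for illustration only; the real ones would come from
# the existing test and the list of models to enable.
MODEL_IDS = ["model-a", "model-b"]
BACKENDS = ["thunderfx", "thunder.jit"]


def build_cases(model_ids, backends):
    """Return (argvalues, ids) for the model x backend cross product."""
    argvalues = list(product(model_ids, backends))
    ids = [f"{model_id}-{backend}" for model_id, backend in argvalues]
    return argvalues, ids


ARGVALUES, IDS = build_cases(MODEL_IDS, BACKENDS)
```

The result could then be passed to `pytest.mark.parametrize("model_id,backend", ARGVALUES, ids=IDS)`, with the test body picking the compile path based on `backend`. Stacking two separate `parametrize` decorators would give the same cross product; the explicit list just makes the test ids easier to control.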

Currently, only two models are covered. Here's the list of models we could enable (none of them requires a Hugging Face token): https://gist.github.com/IvanYashchuk/a16fe212ef4c9e42a648049790543f42.

IvanYashchuk avatar May 02 '25 13:05 IvanYashchuk

Yes, that's definitely the goal @IvanYashchuk, thanks for sharing the list and pointing to test_hf_for_nemo. Now that we have the Studio-backed CI job up and running (daily), we have a way to add coverage on the proper machines without disrupting CI.

The additional thing we were thinking about for thunder.jit only, as @t-vi mentioned at the end of the issue description, is to run acquisition on the meta device alone, which can be run fully on CPU and will allow us to selectively test acquisition.
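
For illustration, a minimal sketch of the meta-device idea using plain PyTorch. The tiny `torch.nn.Linear` is only a stand-in; for a real case the model would presumably be built from a Hugging Face config (e.g. via `AutoModelForCausalLM.from_config`) inside the same context, and `compile_fn` would be `thunder.jit`. Both of those are assumptions, and `check_meta_forward` is a hypothetical helper that reports "skipped" when torch is not installed.

```python
import importlib.util


def check_meta_forward(compile_fn=lambda module: module):
    """Build a model on the meta device and run one shape-only forward pass.

    compile_fn defaults to the identity; passing thunder.jit here is an
    assumption about how the real check would be wired up.
    """
    if importlib.util.find_spec("torch") is None:
        return "skipped"
    import torch

    # Parameters created inside this context carry shapes and dtypes only,
    # so no weights are allocated and no GPU is needed.
    with torch.device("meta"):
        model = torch.nn.Linear(4, 2)  # stand-in for a real HF model
    x = torch.empty(3, 4, device="meta")

    out = compile_fn(model)(x)
    params_on_meta = all(p.is_meta for p in model.parameters())
    if params_on_meta and out.is_meta and tuple(out.shape) == (3, 2):
        return "ok"
    return "unexpected"
```

Since meta tensors hold no data, a check like this can only confirm that tracing and shape propagation go through, not that the numerical results are correct.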

lantiga avatar May 05 '25 14:05 lantiga

it's skipped for two months

This is tracked as #1920; I will comment there.

t-vi avatar May 13 '25 11:05 t-vi

I believe we can close this due to https://github.com/Lightning-AI/lightning-thunder/pull/2281 @t-vi @lantiga

KaelanDt avatar Jul 24 '25 09:07 KaelanDt