
Thunder and ThunderFX are slower than torch.compile for FP8 and falcon-7b and other models

Open mpatel31415 opened this issue 1 year ago • 3 comments

🐛 Bug

As shown below, Thunder is slower than torch.compile for single-GPU training of falcon-7b:

image

Below are the ThunderFX results for multi-GPU training:

image

Batch sizes and sharding modes don't match, but these are the fastest options for ThunderFX:

  • For the first row, with the same micro batch size as torch.compile (6), we get even lower throughput: 46.19.
  • For the second row, with micro batch size 7 and sharding mode zero3, we get throughput 93.5.

To Reproduce

Steps to reproduce the behavior:

python /opt/pytorch/lightning-thunder/thunder/benchmarks/benchmark_litgpt.py \
    --model_name falcon-7b \
    --compile thunder \
    --low_precision_mode fp8-delayed-te  \
    --micro_batch_size 1
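To compare compile backends on equal footing, the command above can be generated once per backend so that only `--compile` changes. The following is a minimal sketch, not part of the benchmark itself: the script path and flags are copied from the command above, while the helper names `build_command` and `format_commands` are hypothetical, and the `"inductor"` value for the torch.compile baseline is an assumption that should be checked against the script's `--help` output. The sketch only prints the commands and does not run them.

```python
import shlex

# Script path copied verbatim from the reproduction command in this issue.
BENCHMARK = "/opt/pytorch/lightning-thunder/thunder/benchmarks/benchmark_litgpt.py"


def build_command(compile_mode, model_name="falcon-7b", micro_batch_size=1,
                  low_precision_mode="fp8-delayed-te"):
    """Return the benchmark invocation as an argument list.

    Only the value passed to --compile varies between runs, so the
    throughput numbers stay comparable across backends.
    """
    return [
        "python", BENCHMARK,
        "--model_name", model_name,
        "--compile", compile_mode,
        "--low_precision_mode", low_precision_mode,
        "--micro_batch_size", str(micro_batch_size),
    ]


def format_commands(compile_modes):
    """Return one shell-quoted command line per compile mode."""
    return [shlex.join(build_command(mode)) for mode in compile_modes]


# "thunder" is confirmed by the command in this issue.
# ASSUMPTION: "inductor" is a guess at the value that selects torch.compile.
for line in format_commands(["thunder", "inductor"]):
    print(line)
```

Each printed line can be pasted into a shell; keeping the model, precision mode, and micro batch size fixed avoids the mismatched settings noted above.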

Expected behavior

Thunder should be as fast as torch.compile.

Environment

system.device_product_name: DGXH100
system.gpu_driver_version: 535.129.03
libraries.cuda: 12.6.2.004
libraries.pip.lightning: 2.4.0.dev20240728
libraries.pip.lightning-thunder: 0.2.0.dev0
libraries.pip.lightning-utilities: 0.11.8
libraries.pip.litgpt: 0.4.11
libraries.pip.nvfuser: 0.2.20+git85c22a2
libraries.pip.pytorch-lightning: 2.4.0
libraries.pip.torch: 2.6.0a0+git96b30dc
libraries.pip.torchmetrics: 1.5.1
libraries.pip.torchvision: 0.19.0a0+d23a6e1

mpatel31415 avatar Oct 30 '24 08:10 mpatel31415

Actually, we see the same results for other models. Is one issue enough to track all of them? Below are the results:

image

mpatel31415 avatar Oct 30 '24 08:10 mpatel31415

Is one issue enough to track all of them?

Given the scale and universality, yeah, I think one issue is enough for all models.

tfogal avatar Nov 01 '24 18:11 tfogal

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Apr 16 '25 06:04 stale[bot]