
refactor: make llama3 generation closer to llama4

Open ashwinb opened this issue 8 months ago • 0 comments

Make the llama3 and llama4 generators simpler and closer to each other. There is a lot of duplicated code that needs to be removed.

Test Plan

Run all variants of the matrix:

  • MODEL in (llama3, llama4)
  • QUANT in (none, fp8_mixed, int4_mixed)

For example, one cell of the matrix:

NGPUS=1
MODEL=llama3
QUANT=fp8_mixed
CHECKPOINT_DIR=~/.llama/checkpoints/Llama3.2-11B-Vision-Instruct/
torchrun --nproc-per-node=$NGPUS -m models.$MODEL.scripts.completion \
   $CHECKPOINT_DIR --world_size $NGPUS --quantization_mode $QUANT
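
To cover the whole matrix rather than a single cell, the invocation can be wrapped in a loop. The following is a minimal sketch, not part of the change itself. It assumes one checkpoint directory per model; `LLAMA3_CHECKPOINT_DIR`, `LLAMA4_CHECKPOINT_DIR`, `RUN`, `run`, and `run_matrix` are hypothetical names introduced here, and the llama4 path is a placeholder because the test plan only names a llama3 checkpoint. By default it only prints the six commands; set `RUN=1` to execute them.

```shell
#!/bin/sh
# Sketch: enumerate MODEL x QUANT from the test plan (2 models x 3 modes).
NGPUS="${NGPUS:-1}"

# Assumption: one checkpoint directory per model, overridable by the caller.
# The llama4 default is a placeholder, not a real path.
LLAMA3_CHECKPOINT_DIR="${LLAMA3_CHECKPOINT_DIR:-$HOME/.llama/checkpoints/Llama3.2-11B-Vision-Instruct/}"
LLAMA4_CHECKPOINT_DIR="${LLAMA4_CHECKPOINT_DIR:-/path/to/llama4/checkpoint/}"

# Print the command in dry-run mode (default); execute it when RUN=1.
run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

run_matrix() {
  for model in llama3 llama4; do
    case "$model" in
      llama3) ckpt="$LLAMA3_CHECKPOINT_DIR" ;;
      llama4) ckpt="$LLAMA4_CHECKPOINT_DIR" ;;
    esac
    for quant in none fp8_mixed int4_mixed; do
      # Same invocation as in the test plan, with MODEL and QUANT substituted.
      run torchrun --nproc-per-node="$NGPUS" -m "models.$model.scripts.completion" \
        "$ckpt" --world_size "$NGPUS" --quantization_mode "$quant" || return 1
    done
  done
}

run_matrix
```

Stopping at the first failing cell (`|| return 1`) is a design choice for this sketch; collecting failures and continuing would work equally well for a full sweep.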

ashwinb, Apr 07 '25 00:04