Proyag Pal comments


Questions about logged throughput metrics

Except for setting the number of GPUs where relevant, the configs are exactly the same, yes. It's adapted a bit from some default recipes, but I didn't add any callbacks. Let...

Questions about logged throughput metrics

I adapted it from [this recipe](https://github.com/NVIDIA/NeMo/blob/main/nemo/collections/llm/recipes/hf_auto_model_for_causal_lm.py), but I'm running it directly as a script in the `nvcr.io/nvidia/nemo:dev` container. Here's a version of the script that has the same logging issues...

Questions about logged throughput metrics

I was using OLMo2 (`allenai/OLMo-2-0425-1B`).

Questions about logged throughput metrics

Ah ok, thanks for checking.

* But when changing from 4 to 8 GPUs, why is `tps` increasing and `it/s` decreasing if they are per GPU?
* Also, here I...