
Debug flag to quickly run and test configs

Open RdoubleA opened this issue 11 months ago • 3 comments

We should introduce a debug mode at the CLI level that automatically runs a config on CPU, without distributed training, for only a small number of steps or epochs. This would be useful for quick sanity checks and testing, and may be needed for recipe integration tests.
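
As a rough illustration of the idea (not torchtune's actual CLI or config schema), a debug mode could boil down to overriding a few config fields before the recipe runs. The sketch below assumes the config is a plain dict; the key names `device`, `distributed`, `epochs`, and `max_steps`, and the helper `apply_debug_overrides`, are hypothetical and chosen only for illustration.

```python
from copy import deepcopy

# Hypothetical overrides for a debug run. These key names are assumptions
# made for this sketch and are not taken from torchtune's config schema.
DEBUG_OVERRIDES = {
    "device": "cpu",       # run on CPU
    "distributed": False,  # skip distributed setup
    "epochs": 1,           # a single epoch
    "max_steps": 2,        # stop after a couple of steps
}


def apply_debug_overrides(config: dict, debug: bool) -> dict:
    """Return a copy of `config`, with debug overrides applied if `debug` is set."""
    cfg = deepcopy(config)  # leave the caller's config untouched
    if debug:
        cfg.update(DEBUG_OVERRIDES)
    return cfg


# Example: a made-up training config and its debug variant.
config = {"device": "cuda", "distributed": True, "epochs": 3, "lr": 2e-5}
debug_config = apply_debug_overrides(config, debug=True)
```

Fields that are not overridden (here `lr`) pass through unchanged, so the debug run still exercises the rest of the config.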

RdoubleA avatar Feb 29 '24 00:02 RdoubleA

Should this just be --device cpu?

NicolasHug avatar Feb 29 '24 09:02 NicolasHug

This will need more work than just --device cpu, since our current recipes have FSDP very tightly integrated. That said, we have single-device recipes that should land in the next week, and those should work out of the box on CPU. cc: @rohan-varma, @ebsmothers

kartikayk avatar Feb 29 '24 16:02 kartikayk

It would be good to have a debug flag for both CPU and GPU, to sanity-test that the setup works for the different types of configs.

chauhang avatar Mar 23 '24 18:03 chauhang