
"gas" configuration doesn't do anything

Open segyges opened this issue 1 year ago • 0 comments

Per this, my understanding is that the gas config in neox doesn't do anything, so it shouldn't be used and should be removed. We should be using gradient_accumulation_steps instead.
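A minimal sketch of how affected configs could be flagged, assuming the config has already been loaded into a plain dict. The function name find_ineffective_gas is hypothetical and not part of neox or pythia, and the check takes at face value the claim above that gas is ignored and that gradient_accumulation_steps defaults to 1.

```python
def find_ineffective_gas(config: dict) -> list[str]:
    """Return warnings for a config dict that sets the `gas` key.

    Assumption (from this issue, not verified against neox here):
    `gas` is ignored, and only `gradient_accumulation_steps` takes
    effect, defaulting to 1 when absent.
    """
    warnings = []
    if "gas" not in config:
        return warnings

    gas = config["gas"]
    # What would actually be applied under the assumption above.
    applied = config.get("gradient_accumulation_steps", 1)

    if gas != applied:
        warnings.append(
            f"gas={gas} but gradient_accumulation_steps={applied}; "
            "the gas value would not take effect"
        )
    else:
        warnings.append(
            f"gas={gas} matches gradient_accumulation_steps, so it is redundant"
        )
    return warnings


# Illustrative dicts, not copied from any real pythia config.
print(find_ineffective_gas({"gas": 2}))
print(find_ineffective_gas({"gas": 1}))
```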

It appears that all existing pythia configs set gas to 1, which is the default for gradient_accumulation_steps anyway, so this does not affect them. Per that same search, some of the old eval results specifically show gas at 2. If the expectation was that gas did something, that would be a serious error: the effective batch size would be half of what was intended.
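To make the halving concrete, here is a small arithmetic sketch. It uses the usual DeepSpeed-style relation (global batch = micro batch per GPU × gradient accumulation steps × number of GPUs). The micro batch size and GPU count are made-up illustrative numbers, not taken from any pythia config.

```python
def effective_batch_size(micro_batch_per_gpu: int,
                         grad_accum_steps: int,
                         num_gpus: int) -> int:
    """Global batch size per optimizer step."""
    return micro_batch_per_gpu * grad_accum_steps * num_gpus


# Illustrative values only (assumptions, not real pythia settings).
micro_batch_per_gpu = 4
num_gpus = 8

# What someone would expect if gas=2 were honored.
expected = effective_batch_size(micro_batch_per_gpu, 2, num_gpus)

# What is applied if gas is ignored and gradient_accumulation_steps stays at 1.
actual = effective_batch_size(micro_batch_per_gpu, 1, num_gpus)

print(expected, actual, actual / expected)  # 64 32 0.5
```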

I am not putting in a PR to replace gas with gradient_accumulation_steps because these configs are references for the settings of existing artifacts. It's not clear to me that they should be fixed to be "correct". If they are fixed going forward, it's also not clear what the right steps would be to make sure the original configs are preserved as references for those artifacts.

segyges · Feb 04 '24 21:02