axon Default loop output transforms are too intrusive

Open. seanmor5 opened this issue 3 years ago • 1 comment

For supervised training loops, we provide a convenience output transform which ensures only the model state is returned from the training loop. This means you always lose the entire training state, which might be of interest later on.

I propose instead that we return a tuple, {loop_state, transformed_state}, which always includes the whole state as well as a transformed version. That way you never accidentally lose the entire state.
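To make the proposed shape concrete, here is a rough sketch in plain Elixir. It does not call Axon: LoopSketch, final_state/0, and finish/2 are hypothetical names, and the map only imitates a loop state that carries model state under step_state.

```elixir
defmodule LoopSketch do
  # Hypothetical stand-in for the state a training loop ends with.
  # The field names are assumptions for illustration, not Axon's definition.
  def final_state do
    %{
      epoch: 5,
      iteration: 100,
      metrics: %{"loss" => 0.12},
      step_state: %{model_state: %{"dense_0" => %{"kernel" => [0.1, 0.2]}}}
    }
  end

  # The behaviour the issue describes today: keep only the model state.
  def model_state_only(state), do: state.step_state.model_state

  # The proposal: always return the whole state next to the transformed one.
  def finish(state), do: finish(state, &model_state_only/1)

  def finish(state, output_transform) do
    {state, output_transform.(state)}
  end
end

# The caller picks what it needs; nothing is dropped along the way.
{loop_state, model_state} = LoopSketch.finish(LoopSketch.final_state())
```

With this shape, a caller who only wants the model state ignores the first element, while metrics and counters stay reachable through loop_state.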

Sep 17 '22 15:09 seanmor5

My suggestion is to get rid of output_transform altogether. After all, anyone can transform the output by piping an operation after it.

:+1: for the trainer returning {model_state, loop_state} though. The user can even access other metadata inside state.step_state. If you want, you can even add more structure by defining a TrainerStep struct which you then place as the step_state.
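A minimal sketch of what that could look like, again in plain Elixir without Axon. The TrainerStep fields and the TrainerSketch.run/0 helper are assumptions made up for illustration; only the {model_state, loop_state} return shape and the idea of piping come from the comments above.

```elixir
defmodule TrainerStep do
  # Hypothetical struct; the fields are illustrative guesses.
  defstruct [:model_state, :optimizer_state, :loss]
end

defmodule TrainerSketch do
  # Hypothetical stand-in for running a trainer loop.
  # Returns {model_state, loop_state} with a TrainerStep as the step_state.
  def run do
    step = %TrainerStep{
      model_state: %{"w" => 1.0},
      optimizer_state: %{"count" => 3},
      loss: 0.5
    }

    loop_state = %{epoch: 1, step_state: step}
    {step.model_state, loop_state}
  end
end

# Common case: destructure and use the model state directly.
{model_state, loop_state} = TrainerSketch.run()

# No output_transform needed: pipe whatever transformation you want afterwards.
loss =
  TrainerSketch.run()
  |> elem(1)
  |> then(& &1.step_state.loss)
```

The struct gives step_state a documented set of keys, so accessing something like the optimizer state is a field lookup rather than a guess at map keys.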

Perhaps it is best to make these changes sooner rather than later, since they are breaking?

Dec 12 '23 08:12 josevalim
