brulee
batch proportion instead of batch size
This would be much easier to tune
I think we usually tune batch_size to be as small as possible while still giving good estimates of the gradient, and that shouldn't depend on the dataset size.
Maybe we could accept both integers and fractions, interpreting values in (0, 1] as a fraction of the dataset and values > 1 as batch_size?
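To make the proposed rule concrete, here is a minimal sketch of how a single argument could be resolved to a number of rows. It is written in Python purely for illustration; `resolve_batch_size`, `batch`, and `n_rows` are hypothetical names, not part of brulee, and the rounding and capping behaviour are assumptions rather than anything decided in this thread.

```python
import math


def resolve_batch_size(batch, n_rows):
    """Hypothetical helper: map a user-supplied `batch` value to a row count.

    Values in (0, 1] are read as a fraction of the training set;
    values above 1 are read as an absolute batch size.
    """
    if n_rows < 1:
        raise ValueError("n_rows must be at least 1")
    if 0 < batch <= 1:
        # Fraction of the data, rounded up so the batch is never empty (assumption).
        return max(1, math.ceil(batch * n_rows))
    if batch > 1:
        if batch != int(batch):
            raise ValueError("values above 1 must be whole numbers")
        # Absolute size, capped at the number of available rows (assumption).
        return min(int(batch), n_rows)
    raise ValueError("batch must be positive")


print(resolve_batch_size(0.25, 1000))  # fraction of the data
print(resolve_batch_size(32, 1000))    # absolute batch size
```

One consequence of this reading is that a value of exactly 1 means the whole dataset, so a batch of a single row could not be requested with the integer 1.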