Less Wright
Hi @bratao - it would still make sense to use, but my recommendation is to run with MABN - moving average batch norm. This creates a moving average across batches...
Their code is linked there, though as I recall it likely needs to be extracted out of their framework. Anyway it's on my todo list and maybe can pull it out...
I'm currently 3/4 done with a rewrite of Ranger (Ranger21) to include a number of new innovations since Ranger was released. I will make that a pip package once done...
Hi @pablogps, Excellent recommendations and feedback. I will update the code per your suggestions and check it back in. Thanks again!
@hiyyg - please try it now. I ran ranger.py through pylint via vscode and had a bunch of tweaks that should resolve this. Let me know if not! Thanks
Hi @hiyyg and @neuronflow, Right now you can turn off the built-in lr scheduling by turning off both warmup and warmdown: `use_warmup=False, warmdown_active=False`. That should simply pass through the...
Hi @fmellomascarenhas, @neuronflow and @hiyyg - fully agree with all the points above (decoupled scheduler and parameter groups). This split between scheduler and optimizer will happen for Ranger22 (the 2022...
I'm working with DETR which is object detection with transformer internally and will test it out there soon. Note that Ranger now has GC (gradient centralization) and will be interesting...
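For anyone unfamiliar with GC, here is a minimal sketch of the idea, assuming a 2D weight gradient stored as a plain list of rows (one row per output unit). The function name `centralize_gradient` is made up for illustration and this is not Ranger's actual code, which works on tensors; it only shows that each row's mean is subtracted so every row of the gradient sums to zero.

```python
def centralize_gradient(grad):
    """Return a copy of a 2D gradient with each row's mean removed.

    Illustrative only: `grad` is a list of rows, one row per output unit.
    """
    centered = []
    for row in grad:
        mean = sum(row) / len(row)
        centered.append([g - mean for g in row])
    return centered


# Toy gradient for a layer with 2 output units and 3 inputs.
grad = [[1.0, 2.0, 3.0], [4.0, 4.0, 10.0]]
centered = centralize_gradient(grad)
```

For conv layers the same mean would be taken over every dimension except the first.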
Hi @neuronflow, @saruarlive is correct - the issue is we need to know how many epochs and how many iterations per epoch in order to auto-compute the lr schedule. Clearly...
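To show why both numbers are needed, here is a rough sketch of a warmup/flat/warmdown schedule computed from the total step count. The function `lr_at` and the `warmup_frac` / `warmdown_frac` defaults are assumptions picked for illustration, not the values or code the optimizer actually uses.

```python
def lr_at(step, base_lr, num_epochs, iters_per_epoch,
          warmup_frac=0.1, warmdown_frac=0.3, min_lr=0.0):
    """Learning rate for a 0-based global step (hypothetical schedule)."""
    # The whole schedule is laid out over the total number of steps,
    # which is why epochs and iterations per epoch must be known up front.
    total = num_epochs * iters_per_epoch
    warmup_steps = int(total * warmup_frac)
    warmdown_start = total - int(total * warmdown_frac)

    if step < warmup_steps:
        # Linear ramp from base_lr / warmup_steps up to base_lr.
        return base_lr * (step + 1) / warmup_steps
    if step >= warmdown_start:
        # Linear decay from base_lr toward min_lr over the final steps.
        remaining = total - step
        span = total - warmdown_start
        return min_lr + (base_lr - min_lr) * remaining / span
    # Flat phase between warmup and warmdown.
    return base_lr


# 10 epochs of 100 iterations: warmup ends at step 100, warmdown starts at 700.
schedule = [lr_at(s, 1.0, 10, 100) for s in range(1000)]
```

Without the total step count there is no way to place the warmup end or the warmdown start, so the schedule cannot be computed automatically.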
Hi @neuronflow, The ValueError above comes from having 4 or more dimensions, à la 3D convolutions. If you pull the latest version that I posted last week then it adaptive clipping...