lumliolum
The authors swept the parameters for the tasks (see Table 7 of the paper). Hope it helps.
Do we have the functionality now in category encoders?
I am using `tqdm_notebook`, but I am not getting any progress bar. Instead I get `HBox(children=(IntProgress(value=0, max=3), HTML(value='')))`. I don't know what the reason is.
Hello, I have some more questions, continuing from what was mentioned before. In the `NormSelect` function (https://github.com/sIncerass/powernorm/blob/9ea6226a3203d5d6fcee07a5c6dec38ec6bc5e9f/fairseq/modules/norm_select.py#L12-L19), for batch norm we are using `MaskSyncBatchNorm`: version of Sync Batch...
The discussions around groupscaling are here: #9, #8
Hey @Ice-Citron, thanks for sharing the code for `SyncPowerNorm`. Currently I don't have access to the server (with multiple GPUs) to run your code. I will get it back after...
> Only realised just now that this normalisation layer is meant for ViTs instead of language transformers But in the paper, they were using this norm layer for machine translation...