Cross-Scale-Non-Local-Attention
Model size and parameters
Hello, how were the model sizes and parameter counts in Table 2 of the paper calculated? And how long did the model take to train?
Hi, the x2 model size in the paper is computed with the following line in main.py:

```python
print('Total params: %.2fM' % (sum(p.numel() for p in _model.parameters())/1000000.0))
```
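For reference, here is a minimal, self-contained sketch of that counting expression applied to a small hypothetical model (the layer sizes below are made up for illustration and are not the paper's architecture). `numel()` returns the number of elements in each parameter tensor, so the sum covers every weight and bias in the model:

```python
import torch.nn as nn

# Hypothetical toy model, used only to illustrate the counting expression.
model = nn.Sequential(
    nn.Conv2d(3, 256, 3, padding=1),    # 3*256*3*3 weights + 256 biases = 7,168
    nn.ReLU(),
    nn.Conv2d(256, 256, 3, padding=1),  # 256*256*3*3 weights + 256 biases = 590,080
    nn.ReLU(),
    nn.Conv2d(256, 3, 3, padding=1),    # 256*3*3*3 weights + 3 biases = 6,915
)

# Same expression as in main.py: total element count over all
# parameter tensors, reported in millions.
total_m = sum(p.numel() for p in model.parameters()) / 1000000.0
print('Total params: %.2fM' % total_m)  # → Total params: 0.60M
```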
Training time depends on your compute resources; it takes roughly 5~6 days on 4 V100 GPUs.