
why?

ld-xy opened this issue · 5 comments

ResNet(
  (conv1): Conv2d()
  (bn1): BatchNorm2d()
  (relu): ReLU()
  (maxpool): MaxPool2d()
  (layer1): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d()
      (bn1): BatchNorm2d()
      (relu): ReLU()
      (conv2): Conv2d()
      (bn2): BatchNorm2d()
    )
    (1): BasicBlock(
      (conv1): Conv2d()
      (bn1): BatchNorm2d()
      (relu): ReLU()
      (conv2): Conv2d()
      (bn2): BatchNorm2d()
    )
  )
  (layer2): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d()
      (bn1): BatchNorm2d()
      (relu): ReLU()
      (conv2): Conv2d()
      (bn2): BatchNorm2d()
      (downsample): Sequential(
        (0): Conv2d()
        (1): BatchNorm2d()
      )
    )
    (1): BasicBlock(
      (conv1): Conv2d()
      (bn1): BatchNorm2d()
      (relu): ReLU()
      (conv2): Conv2d()
      (bn2): BatchNorm2d()
    )
  )
  (layer3): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d()
      (bn1): BatchNorm2d()
      (relu): ReLU()
      (conv2): Conv2d(Forward Time: 15.625000ms | Backward Time: 15.625000ms)
      (bn2): BatchNorm2d()
      (downsample): Sequential(
        (0): Conv2d()
        (1): BatchNorm2d()
      )
    )
    (1): BasicBlock(
      (conv1): Conv2d(Forward Time: 15.625000ms | Backward Time: 31.250000ms)
      (bn1): BatchNorm2d()
      (relu): ReLU()
      (conv2): Conv2d()
      (bn2): BatchNorm2d()
    )
  )
  (layer4): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d()
      (bn1): BatchNorm2d()
      (relu): ReLU()
      (conv2): Conv2d()
      (bn2): BatchNorm2d()
      (downsample): Sequential(
        (0): Conv2d()
        (1): BatchNorm2d()
      )
    )
    (1): BasicBlock(
      (conv1): Conv2d()
      (bn1): BatchNorm2d()
      (relu): ReLU()
      (conv2): Conv2d()
      (bn2): BatchNorm2d()
    )
  )
  (avgpool): AdaptiveAvgPool2d()
  (fc): Linear()
)

ld-xy · Dec 28 '22 08:12

Why do some layers report timings while others don't?

ld-xy · Dec 28 '22 08:12

Thanks for opening the issue, @ld-xy.

I think this could be related to the note mentioned in the README:

NOTE: We are unable to capture the timings for BatchNorm and ReLU because of in-place operations either performed by the layer or following it.

Ref: https://github.com/pytorch/pytorch/issues/61519
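For background, hook-based layer timing (the general technique a layer profiler like this relies on) looks roughly like the sketch below. The `timings` dict and hook functions are illustrative, not torchnnprofiler's actual API; the point is that forward hooks bracket each module's call, while in-place ops (e.g. `nn.ReLU(inplace=True)`) mutate the very tensors the backward-side hooks would need, which is why those layers show no timings (see the linked PyTorch issue).

```python
# Minimal sketch of hook-based per-layer forward timing.
# Illustrative only -- not torchnnprofiler's real implementation.
import time
import torch
import torch.nn as nn

timings = {}

def attach_timer(name, module):
    def pre_hook(mod, inp):
        # Record the wall-clock time just before the module runs.
        timings[name] = {"start": time.perf_counter()}

    def post_hook(mod, inp, out):
        # Record elapsed time once the module's forward returns.
        elapsed = time.perf_counter() - timings[name]["start"]
        timings[name]["forward_ms"] = elapsed * 1000

    module.register_forward_pre_hook(pre_hook)
    module.register_forward_hook(post_hook)

model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8), nn.ReLU())
for name, mod in model.named_children():
    attach_timer(name, mod)

model(torch.randn(1, 3, 32, 32))
print({k: round(v["forward_ms"], 3) for k, v in timings.items()})
```

Note this only covers forward timing; timing the backward pass additionally requires hooks on the autograd graph, which is where in-place mutation gets in the way.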

Also, it would be great if you could share a small repro snippet, so I can test whether it is an actual bug in the profiler.

Thanks :)

kshitij12345 · Dec 28 '22 08:12

Hello, I ran the example ResNet you provided in the examples, but I did not get the results you show.

ld-xy · Dec 28 '22 09:12

Thank you for confirming. I will have a look soon (probably after the New Year). :)

kshitij12345 · Dec 28 '22 09:12

@ld-xy Sorry for the delayed reply. I am not able to reproduce this on the latest PyTorch. Could you please tell me which version of PyTorch you are using? Thank you :)
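A quick way to grab the relevant details (standard PyTorch introspection, nothing specific to this repo):

```python
# Print the PyTorch build details relevant to reproducing the issue.
import torch

print(torch.__version__)          # installed PyTorch version string
print(torch.version.cuda)         # CUDA toolkit the wheel was built with (None for CPU-only)
print(torch.cuda.is_available())  # whether a GPU is actually usable at runtime
```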


ResNet(
  (conv1): Conv2d(Forward Time: 660.225743ms | Backward Time: 0.374010ms)
  (bn1): BatchNorm2d()
  (relu): ReLU()
  (maxpool): MaxPool2d(Forward Time: 80.600460ms | Backward Time: 41.181822ms)
  (layer1): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d(Forward Time: 465.398668ms | Backward Time: 767.773412ms)
      (bn1): BatchNorm2d()
      (relu): ReLU()
      (conv2): Conv2d(Forward Time: 478.477239ms | Backward Time: 759.474148ms)
      (bn2): BatchNorm2d()
    )
    (1): BasicBlock(
      (conv1): Conv2d(Forward Time: 492.544588ms | Backward Time: 761.948107ms)
      (bn1): BatchNorm2d()
      (relu): ReLU()
      (conv2): Conv2d(Forward Time: 481.007866ms | Backward Time: 809.356628ms)
      (bn2): BatchNorm2d()
    )
  )
  (layer2): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d(Forward Time: 119.331581ms | Backward Time: 210.161808ms)
      (bn1): BatchNorm2d()
      (relu): ReLU()
      (conv2): Conv2d(Forward Time: 259.967366ms | Backward Time: 467.255538ms)
      (bn2): BatchNorm2d()
      (downsample): Sequential(
        (0): Conv2d(Forward Time: 26.707087ms | Backward Time: 52.755344ms)
        (1): BatchNorm2d(Forward Time: 13.712976ms | Backward Time: 9.125574ms)
      )
    )
    (1): BasicBlock(
      (conv1): Conv2d(Forward Time: 256.689748ms | Backward Time: 462.981328ms)
      (bn1): BatchNorm2d()
      (relu): ReLU()
      (conv2): Conv2d(Forward Time: 268.672347ms | Backward Time: 474.130361ms)
      (bn2): BatchNorm2d()
    )
  )
  (layer3): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d(Forward Time: 87.506342ms | Backward Time: 156.374294ms)
      (bn1): BatchNorm2d()
      (relu): ReLU()
      (conv2): Conv2d(Forward Time: 145.203582ms | Backward Time: 272.365629ms)
      (bn2): BatchNorm2d()
      (downsample): Sequential(
        (0): Conv2d(Forward Time: 13.606398ms | Backward Time: 36.971666ms)
        (1): BatchNorm2d(Forward Time: 6.890035ms | Backward Time: 5.812464ms)
      )
    )
    (1): BasicBlock(
      (conv1): Conv2d(Forward Time: 140.756974ms | Backward Time: 266.748421ms)
      (bn1): BatchNorm2d()
      (relu): ReLU()
      (conv2): Conv2d(Forward Time: 140.667546ms | Backward Time: 281.216907ms)
      (bn2): BatchNorm2d()
    )
  )
  (layer4): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d(Forward Time: 82.280094ms | Backward Time: 131.356780ms)
      (bn1): BatchNorm2d()
      (relu): ReLU()
      (conv2): Conv2d(Forward Time: 170.941744ms | Backward Time: 250.199910ms)
      (bn2): BatchNorm2d()
      (downsample): Sequential(
        (0): Conv2d(Forward Time: 13.408525ms | Backward Time: 24.923987ms)
        (1): BatchNorm2d(Forward Time: 4.460331ms | Backward Time: 3.728276ms)
      )
    )
    (1): BasicBlock(
      (conv1): Conv2d(Forward Time: 143.325021ms | Backward Time: 250.884337ms)
      (bn1): BatchNorm2d()
      (relu): ReLU()
      (conv2): Conv2d(Forward Time: 159.091152ms | Backward Time: 270.752946ms)
      (bn2): BatchNorm2d()
    )
  )
  (avgpool): AdaptiveAvgPool2d(Forward Time: 4.075216ms | Backward Time: 3.504959ms)
  (fc): Linear(Forward Time: 6.385692ms | Backward Time: 17.630702ms)
)

kshitij12345 · Jan 11 '23 14:01