
p1ch8: "ResBlock" objects are most likely identical

Open aallahyar opened this issue 3 years ago • 2 comments

In the scripts of p1ch8 (section 8.5.3, page 227 to be specific), the stack of convolutional sub-blocks is built with the following code:

self.resblocks = nn.Sequential(
            *(n_blocks * [ResBlock(n_chans=n_chans1)]))

However, since list multiplication copies references rather than objects, all n_blocks entries point to the same ResBlock instance, so their weight matrices are one and the same (see the short sketch after the fix below).

I think the code should be changed to:

self.resblocks = nn.Sequential(
            *[ResBlock(n_chans=n_chans1) for _ in range(n_blocks)])
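To illustrate, this is just how list multiplication works in Python: it repeats the reference, not the object. A minimal sketch (nn.Linear is only a stand-in for ResBlock here):

import torch.nn as nn

# List multiplication repeats the reference, so every entry is the same module.
shared = 3 * [nn.Linear(4, 4)]
print(shared[0] is shared[1])        # True: one module, three references

# A list comprehension calls the constructor once per entry, so the modules are independent.
separate = [nn.Linear(4, 4) for _ in range(3)]
print(separate[0] is separate[1])    # False: distinct modules with their own weights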

aallahyar avatar May 10 '22 18:05 aallahyar

Absolutely, thank you for spotting this and reporting.

t-vi avatar May 10 '22 20:05 t-vi

@aallahyar Yes.

Here is an example for other readers:

import torch
import torch.nn as nn
import torch.nn.functional as F

class ResBlock(nn.Module):
    # A minimal ResBlock so the snippet runs on its own (simplified from the book's version in 8.5.3).
    def __init__(self, n_chans):
        super().__init__()
        self.conv = nn.Conv2d(n_chans, n_chans, kernel_size=3, padding=1, bias=False)
        self.batch_norm = nn.BatchNorm2d(num_features=n_chans)

    def forward(self, x):
        out = self.conv(x)
        out = self.batch_norm(out)
        out = torch.relu(out)
        return out + x

class NetResDeep(nn.Module):
    def __init__(self, n_chans1=32, n_blocks=10):
        super().__init__()
        self.n_chans1 = n_chans1
        self.conv1 = nn.Conv2d(3, n_chans1, kernel_size=3, padding=1)
        self.resblocks = nn.Sequential(*([ResBlock(n_chans=n_chans1)] * n_blocks))    # shown in the book
        #self.resblocks = nn.Sequential(*[ResBlock(n_chans=n_chans1) for _ in range(n_blocks)])    # the corrected version
        self.fc1 = nn.Linear(n_chans1 * 8 * 8, 32)
        self.fc2 = nn.Linear(32, 2)

    def forward(self, x):
        out = F.max_pool2d(torch.relu(self.conv1(x)), 2)
        out = F.max_pool2d(self.resblocks(out), 2)
        out = out.view(-1, self.n_chans1 * 8 * 8)
        out = torch.relu(self.fc1(out))
        out = self.fc2(out)
        return out

netresdeep = NetResDeep()
id(netresdeep.resblocks[0].conv.weight) == id(netresdeep.resblocks[1].conv.weight)

Output:

True

This result shows that the two ResBlocks hold the very same weight tensor: they are the same module instance, not independent copies.
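As a further check (a sketch, not from the book), nn.Module.parameters() yields each tensor only once even when sub-modules share it, so counting parameters also exposes the difference. This assumes the ResBlock class from the example above:

def count_params(module):
    # parameters() de-duplicates shared tensors, so a stack of identical
    # ResBlocks reports only a single block's worth of parameters.
    return sum(p.numel() for p in module.parameters())

shared = nn.Sequential(*(10 * [ResBlock(n_chans=32)]))
independent = nn.Sequential(*[ResBlock(n_chans=32) for _ in range(10)])
print(count_params(shared), count_params(independent))    # the second is 10x the first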

ftianRF avatar May 19 '22 07:05 ftianRF