
Feedback about What is torch.nn really?

bogpetre opened this issue 1 month ago • 0 comments

There is the following issue on this page: https://docs.pytorch.org/tutorials/beginner/nn_tutorial.html

In the section "Neural net from scratch (without torch.nn)", the loss function is evaluated before any training on a mini-batch of 64 instances (bs = 64):

xb = x_train[0:bs]  # first mini-batch of bs = 64 instances
preds = model(xb)
yb = y_train[0:bs]
print(loss_func(preds, yb))

then training is performed (the comments on xb and yb are my own)

for epoch in range(epochs):
    for i in range((n - 1) // bs + 1):
        #         set_trace()
        start_i = i * bs
        end_i = start_i + bs
        xb = x_train[start_i:end_i] # note that xb gets redefined
        yb = y_train[start_i:end_i] # note that yb gets redefined
        pred = model(xb)
        loss = loss_func(pred, yb)

        loss.backward()
        with torch.no_grad():
            weights -= weights.grad * lr
            bias -= bias.grad * lr
            weights.grad.zero_()
            bias.grad.zero_()

and the loss function is evaluated again to demonstrate a reduction in loss.

print(loss_func(model(xb), yb), accuracy(model(xb), yb))

The final evaluation is not applied to the same data as the first, though. Both invoke xb and yb, but in the pre-training evaluation xb and yb hold the first 64 instances of the training set, whereas during training these variables are overwritten with each subsequent batch, so the final evaluation is performed on the last batch of the last epoch.
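To make the mismatch concrete, a small check could be run right after the loop (assuming n = x_train.shape[0] as defined earlier in the tutorial; last_start is my own name):

import torch

# index of the last loop iteration is i = (n - 1) // bs
last_start = ((n - 1) // bs) * bs
# xb and yb still hold the final batch, not the first one
assert torch.equal(xb, x_train[last_start:last_start + bs])
assert torch.equal(yb, y_train[last_start:last_start + bs])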

Pre- and post-training evaluations should be performed on the same batch: either the original 64 instances from the first training batch, or (if the intent is to demonstrate generalization loss) the test dataset.
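For the first option, the final print could re-fetch the first batch before evaluating (a minimal sketch reusing the tutorial's model, loss_func, and accuracy; the xb0/yb0 names are my own):

# re-evaluate on the same first mini-batch used before training
xb0 = x_train[0:bs]
yb0 = y_train[0:bs]
print(loss_func(model(xb0), yb0), accuracy(model(xb0), yb0))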

cc @albanD @jbschlosser

bogpetre · Nov 26 '25 21:11