💡 [REQUEST] - Clarification of requires_grad in beginner/nn_tutorial.html
🚀 Describe the improvement or the new tutorial
It was initially challenging for me to grasp why `requires_grad` is set on a separate line after the `weights` initialization, but passed inline when creating `bias`, in https://docs.pytorch.org/tutorials/beginner/nn_tutorial.html#neural-net-from-scratch-without-torch-nn
At first glance, the code looks inconsistent (see the snippet below):

- `weights` initialization is split into two lines.
- `bias` initialization is done in one line.
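For reference, the initialization in that section looks roughly like this (shapes are from the tutorial's MNIST example; quoted from memory, so minor details may differ):

```python
import math

import torch

# weights: two lines -- initialize, then turn on gradient tracking in place
weights = torch.randn(784, 10) / math.sqrt(784)
weights.requires_grad_()

# bias: one line -- gradient tracking requested directly in the factory call
bias = torch.zeros(10, requires_grad=True)
```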
The Logic Gap

The tutorial currently shows what is done, but not exactly why the distinction exists between these two specific variables.
- The `bias` is created using a factory function (`torch.zeros`) with no subsequent mathematical operations. It is born as a "Leaf Node" (a source parameter).
- The `weights` involve a mathematical operation (`/ math.sqrt(...)`). If we set `requires_grad=True` inside `torch.randn()`, PyTorch records the division as a computational step. The resulting `weights` variable becomes a non-leaf node (a calculated outcome), which the optimizer cannot update. The sketch after this list demonstrates the difference.
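A minimal sketch of the leaf/non-leaf distinction (the variable names `w_bad` and `w_good` are mine, purely for illustration):

```python
import math

import torch

# requires_grad=True inside the factory call, followed by math:
# the division is recorded, so the result is NOT a leaf node.
w_bad = torch.randn(784, 10, requires_grad=True) / math.sqrt(784)
print(w_bad.is_leaf)   # False
print(w_bad.grad_fn)   # <DivBackward0 ...> -- it is a computed result

# Math first, then requires_grad_() on the finished tensor:
# the tensor stays a leaf node, so .grad gets populated and it can be updated.
w_good = torch.randn(784, 10) / math.sqrt(784)
w_good.requires_grad_()
print(w_good.is_leaf)  # True

# bias has no post-creation math, so the inline flag is already safe.
bias = torch.zeros(10, requires_grad=True)
print(bias.is_leaf)    # True
```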
Proposed Improvement
I propose modifying the comment block to explicitly mention that `requires_grad` must be deferred until after the initialization math is complete, so the tensor remains a trainable parameter (Leaf Node).
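One possible shape for that comment block (the wording is only a suggestion, reusing the tutorial's existing initialization):

```python
# We set requires_grad AFTER the initialization math on purpose:
# calling requires_grad_() on the finished tensor keeps `weights` a leaf node,
# so autograd populates weights.grad and the training loop can update it.
# `bias` needs no post-creation math, so requires_grad=True can be passed
# directly to the torch.zeros() call.
weights = torch.randn(784, 10) / math.sqrt(784)
weights.requires_grad_()
bias = torch.zeros(10, requires_grad=True)
```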
Existing tutorials on this topic
- https://docs.pytorch.org/tutorials/beginner/nn_tutorial.html
- https://docs.pytorch.org/tutorials/beginner/nn_tutorial.html#neural-net-from-scratch-without-torch-nn