
Improve `Julia & Lux for the uninitiated`

claforte opened this issue 2 years ago · 1 comment

Hi, congrats on a very interesting package, I look forward to trying it out! I'm going through the docs and noticed some typos, and I also suggest a few small improvements. I couldn't easily identify the original files to do a PR, so here they are:

In http://lux.csail.mit.edu/dev/examples/generated/beginner/Basics/main/:

  • we don't enfore it -> we don't enforce it
  • We relu on the Julia StdLib -> We rely on the Julia StdLib
  • we create an PRNG and seed it -> we create a PRNG (pseudorandom number generator) and seed (initialize) it
  • we should use Lux.replicate on PRNG before using them -> we should use Lux.replicate on PRNGs before using them
  • provides an uniform API -> provides a uniform API
  • Note that AD.gradient will only work for scalar valued outputs -> Note that AD.gradient will only work for scalar valued outputs. (period at the end.)
  • to demonstrate Lux let us use the Dense layer. -> to demonstrate Lux, let's use the Dense layer. (Equivalent to Pytorch's nn.Linear) A minimal usage sketch is shown after this list.
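For PyTorch users skimming this, here is a minimal sketch of what the PRNG handling and the Dense layer look like in Lux (the layer sizes and input below are made up for illustration, not the tutorial's actual values):

```julia
using Lux, Random

# Create a PRNG (pseudorandom number generator) and seed it for reproducibility
rng = Random.default_rng()
Random.seed!(rng, 0)

# Dense(in_dims, out_dims) is the Lux analogue of PyTorch's nn.Linear(in_features, out_features)
model = Dense(4, 2)

# Lux layers are stateless: parameters and state live outside the layer.
# Lux.replicate copies the PRNG so the original stream is not advanced.
ps, st = Lux.setup(Lux.replicate(rng), model)

x = randn(rng, Float32, 4, 3)   # 4 features, batch of 3
y, st = model(x, ps, st)        # forward pass returns the output and the (possibly updated) state
```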

On the same page, I recommend adding a line to make the following a bit more "user-friendly", e.g. for Pytorch users curious about Julia+Lux:

  • ∇f(x) = x:
    • add underneath: "∇" can be typed by \del<tab> in the Julia REPL or in a Julia-compatible editor. You can press ? in the REPL to enter Julia *help* mode and then paste the ∇ to find out how to type any Unicode character in Julia.
  • For updating our parameters let's use [Optimisers.jl](https://github.com/FluxML/Optimisers.jl) -> To update our parameters, let's use an SGD (Stochastic Gradient Descent) optimiser from [Optimisers.jl](https://github.com/FluxML/Optimisers.jl) with the learning rate set to 0.01:
  • Initialize the initial state of the optimiser -> Setup the initial state of the optimiser:
  • Define the loss function -> Define the loss function:
  • println("Loss Value with ground true W & b: ", mse(W, b, x_samples, y_samples)) -> println("Loss value evaluated with true parameters (weights and biases): ", mse(W, b, x_samples, y_samples))
  • # Perform parameter update -> # Update model's parameters: (A minimal sketch of this Optimisers.jl workflow is shown after this list.)
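To make the Optimisers.jl suggestions above concrete, here is a minimal sketch of that workflow. The toy data, shapes, and parameter values are made up for illustration; the tutorial uses its own W, b, and samples:

```julia
using Optimisers, Zygote, Random

rng = Random.default_rng()
Random.seed!(rng, 0)

# Toy data and parameters (illustrative only)
x_samples = randn(rng, Float32, 4, 32)
y_samples = randn(rng, Float32, 1, 32)
params = (W = randn(rng, Float32, 1, 4), b = zeros(Float32, 1))

# Define the loss function
mse(W, b, x, y) = sum(abs2, W * x .+ b .- y) / size(y, 2)

# SGD (Stochastic Gradient Descent) with the learning rate set to 0.01
opt = Descent(0.01f0)

# Setup the initial state of the optimiser
opt_state = Optimisers.setup(opt, params)

# Compute the gradient and perform one parameter update
loss, grads = Zygote.withgradient(p -> mse(p.W, p.b, x_samples, y_samples), params)
opt_state, params = Optimisers.update(opt_state, params, grads[1])

println("Loss value before the update: ", loss)
```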

IMHO the Jacobian-Vector Product and the Vector-Jacobian Product sections are technical details that are unlikely to be of interest to most people first looking at the docs... I recommend moving those sections to the bottom of that page, or at least prefacing them with a "side-note: " so people can skip them.
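For anyone who does want those sections, the distinction is that a JVP pushes a tangent vector forward through the function (forward mode), while a VJP pulls a cotangent vector back (reverse mode). A rough sketch, using Zygote.jl and ForwardDiff.jl rather than whatever AD calls the tutorial itself uses:

```julia
using Zygote, ForwardDiff

f(x) = [sum(abs2, x), prod(x)]   # a simple R^3 -> R^2 function

x = [1.0, 2.0, 3.0]
v = [1.0, 0.0, 0.0]              # tangent vector (input space)
u = [1.0, 0.0]                   # cotangent vector (output space)

# VJP (reverse mode): pull u back through f, giving u' * J
y, back = Zygote.pullback(f, x)
vjp = back(u)[1]

# JVP (forward mode): push v forward through f, giving J * v
jvp = ForwardDiff.derivative(t -> f(x .+ t .* v), 0.0)
```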

claforte · Jul 20 '22 00:07

Thanks for the pointers. The file is here: https://github.com/avik-pal/Lux.jl/tree/main/examples/Basics. (At some point I will get around to writing up how the docs are built, to help contributors.)

I agree with all but one change. The ∇f(x) needs to be updated to df(x) instead, in line with the style guide for Lux.

avik-pal · Jul 20 '22 01:07