Flux.jl icon indicating copy to clipboard operation
Flux.jl copied to clipboard

Some small printing upgrades

Open mcabbott opened this issue 2 years ago • 5 comments

This is intended for https://github.com/FluxML/Fluxperimental.jl/pull/20 but probably a good idea anyway. Motivating example is something like this (where bigger arrays will print pages of numbers):

julia> struct Tmp2; x; y; end; Flux.@functor Tmp2

julia> Chain(Tmp2([Dense(2,3), randn(3,4)'], (x=1:3, y=Dense(3,4), z=rand(3))))
Chain(
  Tmp2(
    Array(
      Dense(2 => 3),                    # 9 parameters
      [0.351978391016603 0.6408681372462821 -1.326533184688648; 0.09481930831795712 
1.430103476272605 0.7250467613675332; 2.03372151428719 -0.015879812799495713 
1.9499692162118236; -1.6346846180722918 -0.8364610153059454 -1.2907265737483433],  # 12 parameters
    ),
    NamedTuple(
      1:3,                              # 3 parameters
      Dense(3 => 4),                    # 16 parameters
      [0.9666158193429335, 0.01613900990539574, 0.0205920186127464],  # 3 parameters
    ),
  ),
)                   # Total: 7 arrays, 43 parameters, 644 bytes.

Notice that Array() and NamedTuple() aren't actually valid syntax. Also that it prints whole arrays if they aren't inside a layer the way it expects. After this PR:

julia> Chain(Tmp2([Dense(2,3), randn(3,4)'], (x=1:3, y=Dense(3,4), z=rand(3))))
Chain(
  Tmp2(
    [
      Dense(2 => 3),                    # 9 parameters
      4×3 Adjoint,                      # 12 parameters
    ],
    (;
      x = 3-element UnitRange,          # 3 parameters
      y = Dense(3 => 4),                # 16 parameters
      z = 3-element Array,              # 3 parameters
    ),
  ),
)                   # Total: 7 arrays, 43 parameters, 644 bytes.

mcabbott avatar Oct 13 '23 12:10 mcabbott

Is having the summary instead of the actual contents important for @compact? If not, my preference would be to keep the more accurate printing but drop that part.

ToucheSir avatar Oct 14 '23 22:10 ToucheSir

The idea is that printing the values of a huge matrix isn't helpful, but showing the size is. This is what the present printing of e.g. Dense achieves.

But @compact seems to often want to store the arrays themselves without some layer.

(It's not exactly summary as this gets quite long for e.g. CuArrays)

mcabbott avatar Oct 14 '23 22:10 mcabbott

Revisiting this, it would be nice to retain some more info such as eltype if the contents aren't printed. Though it's far from perfect, could we avoid a lot of complexity by setting the :limit and :compact IOContext properties when showing these leaf nodes?

ToucheSir avatar Nov 10 '23 05:11 ToucheSir

Latest commit keeps eltype as suggested (or really, the first type parameter):

julia> struct Tmp2; x; y; end; Flux.@functor Tmp2

julia> Chain(Tmp2([Dense(2,3), randn(3,4)'], (x=1:3, y=Dense(3,4), z=rand(3))))
Chain(
  Tmp2(
    [
      Dense(2 => 3),                    # 9 parameters
      4×3 Adjoint{Float64,...},         # 12 parameters
    ],
    (;
      x = 3-element UnitRange{Int64},   # 3 parameters
      y = Dense(3 => 4),                # 16 parameters
      z = 3-element Vector{Float64},    # 3 parameters
    ),
  ),
)                   # Total: 7 arrays, 43 parameters, 780 bytes.

Note also that no model using existing layers is affected by this. The goal is mainly to make @compact layers print more like the existing ones, by describing their parameter arrays, not printing the actual numbers.

mcabbott avatar Nov 28 '23 14:11 mcabbott

We were discussing this during our call this week. The consensus was that using IOContext(..., :limit => true) is preferable to a custom summary for arrays. This would keep things succinct while still matching Base standards/expectations.

There are times when the type (e.g. Adjoint) is desired. This case might be better addressed by offering something like Flax's tabulate.

darsnack avatar Dec 01 '23 17:12 darsnack