Flux.jl icon indicating copy to clipboard operation
Flux.jl copied to clipboard

Given that DataLoader implements `length` shouldn't it also be able to provide size?

Open FelixBenning opened this issue 1 year ago • 4 comments

https://github.com/FluxML/Flux.jl/blob/aff2f9e5aaaed36282a02a4032c787b999513da2/src/data/dataloader.jl#L100-L103

ERROR: MethodError: no method matching size(::MLUtils.DataLoader{Tuple{Array{Float32, 3}, OneHotArrays.OneHotMatrix{UInt32, Vector{UInt32}}}, Random._GLOBAL_RNG, Val{nothing}})
Closest candidates are:
  size(::Union{LinearAlgebra.QR, LinearAlgebra.QRCompactWY, LinearAlgebra.QRPivoted}) at /Applications/Julia-1.8.app/Contents/Resources/julia/share/julia/stdlib/v1.8/LinearAlgebra/src/qr.jl:581
  size(::Union{LinearAlgebra.QR, LinearAlgebra.QRCompactWY, LinearAlgebra.QRPivoted}, ::Integer) at /Applications/Julia-1.8.app/Contents/Resources/julia/share/julia/stdlib/v1.8/LinearAlgebra/src/qr.jl:580
  size(::Union{LinearAlgebra.Cholesky, LinearAlgebra.CholeskyPivoted}) at /Applications/Julia-1.8.app/Contents/Resources/julia/share/julia/stdlib/v1.8/LinearAlgebra/src/cholesky.jl:514

The size is requested by the @progress macro of ProgressLogging.jl

FelixBenning avatar Jan 25 '24 14:01 FelixBenning

Hey, I would like to work on it. If I am understanding it correctly there should be a function Base.size(d::DataLoader) that returns the same thing as length right?

kettogg avatar Jan 28 '24 13:01 kettogg

size is defined to return a tuple, instead of a single int, so almost. I'd have a look at the docs/REPL help for size and try it out on a couple of different AbstractVector subtypes to get a feel for how it should be implemented.

ToucheSir avatar Jan 28 '24 18:01 ToucheSir

Note BTW that the code for this lives in MLUtils.jl, not in the Flux.jl repository: https://github.com/JuliaML/MLUtils.jl/blob/main/src/eachobs.jl#L186

Should be an easy PR to add this, Base.size(d::DataLoader) = (length(d),) and a test.

At least if I'm thinking correctly. The size should always be the size of collect(d), and this will always be a 1-dimensional container, Vector, I think.

mcabbott avatar Jan 30 '24 23:01 mcabbott

Where should the tests be added? Should it be added in https://github.com/JuliaML/MLUtils.jl/blob/main/test/dataloader.jl?

kettogg avatar Feb 07 '24 17:02 kettogg