Add option to throw error on passing wrong precision floats to layers
Motivation and description
The warning about wrong precision is very helpful for pointing out potential performance issues:
https://github.com/FluxML/Flux.jl/blob/2f19e681040516db9803111819fcfaf5b44f07eb/src/layers/stateless.jl#L60
I think this is the correct default behavior. However, in order to find out where the problem is coming from, throwing an error (to produce a stacktrace) would be very helpful.
Possible Implementation
There could be a Preference or a global flag that switches from warnings to errors for wrong-precision inputs. This would also be consistent with CUDA.allowscalar.
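To make the idea concrete, here is a minimal sketch of what such a global flag could look like. All names here (`WRONG_PRECISION_ERROR`, `allow_wrong_precision`, `_precision_notice`) are hypothetical and not part of Flux's API; a Preferences.jl key could serve the same role as the `Ref`.

```julia
# Hypothetical sketch, not Flux's actual implementation.
const WRONG_PRECISION_ERROR = Ref(false)

# Illustrative switch, mirroring the style of CUDA.allowscalar:
# allow_wrong_precision(false) upgrades the warning to an error.
allow_wrong_precision(flag::Bool = true) = (WRONG_PRECISION_ERROR[] = !flag; nothing)

# Sketch of how the existing warning site could consult the flag.
function _precision_notice(layer_T, input_T)
    msg = "Layer with $layer_T parameters received $input_T input; converting on every call is slow."
    if WRONG_PRECISION_ERROR[]
        error(msg)           # error -> full stacktrace pointing at the offending call
    else
        @warn msg maxlog=1   # the current friendly-warning behavior
    end
end
```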
This seems like a good idea.
Maybe it should literally be the same switch as CUDA.allowscalar, to avoid introducing ever more functions you have to know about? It's not exactly the same meaning, but it's likely to be used at the same time.
That function belongs to GPUArraysCore, which Flux doesn't directly load right now, but NNlib does.
When you say "same switch", do you mean defining something like Flux.throw_sanity_check_errors(), which flips the requested precision switch in the OP as well as turning off CUDA.allowscalar? Or do you mean that if CUDA.allowscalar is false, then we throw errors for precision? If it's the latter, that seems wrong to me, since scalar indexing and floating-point precision are totally unrelated.
I also think it makes sense to have a separate function for each of these checks. But having a Flux.enable_all_sanity_check_errors() on top of that would be good for discoverability (its docstring could also point to the relevant sub-functions).
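A rough, self-contained sketch of "one switch per check, plus an umbrella function". Every name here is illustrative only; in practice the scalar-indexing entry would defer to the real GPUArraysCore.allowscalar rather than a local flag.

```julia
# Illustrative only: one Ref per sanity check, all reachable from one place.
const SANITY_ERROR_SWITCHES = Dict{Symbol,Ref{Bool}}(
    :wrong_precision => Ref(false),
    :scalar_indexing => Ref(false),  # would really defer to GPUArraysCore.allowscalar
)

"Hypothetical umbrella: flip every sanity-check warning into an error (or back)."
function enable_all_sanity_check_errors(flag::Bool = true)
    for switch in values(SANITY_ERROR_SWITCHES)
        switch[] = flag
    end
    return nothing
end
```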
seems wrong to me since scalar indexing and floating point precision are totally unrelated
This was my suggestion! Float precision is a big deal on GPU and not otherwise. There are basically two modes of using the GPU:
- You just load CUDA, and most things work (but may be super-slow & give warnings). Probably you want friendly warnings from Flux too.
- Or you say `CUDA.allowscalar(false)`, and then things will either be fast or give an error. You do this once you have it basically working. Probably you want Flux to also tell you if you are doing anything slow?
Of course we could invent some new switch that we own to control this. But then it's one more mysterious function you have to know about. One more kind of mutable state.
Note that CUDA.jl has switched the default to disallow scalar access.
Maybe that means using the same switch is a worse idea. So if we own a switch, what's a good name for it?
```julia
julia> using CUDA

julia> first(CUDA.randn(32))
ERROR: Scalar indexing is disallowed.
Invocation of getindex resulted in scalar indexing of a GPU array.
...

julia> CUDA.allowscalar(true)
┌ Warning: It's not recommended to use allowscalar([true]) to allow scalar indexing.
│ Instead, use `allowscalar() do end` or `@allowscalar` to denote exactly which operations can use scalar operations.
└ @ GPUArraysCore ~/.julia/packages/GPUArraysCore/GMsgk/src/GPUArraysCore.jl:188

julia> first(CUDA.randn(32))
0.37104183f0

(@v1.11) pkg> st CUDA
Status `~/.julia/environments/v1.11/Project.toml`
  [052768ef] CUDA v5.5.2
```