Add option to throw error on passing wrong precision floats to layers
Motivation and description
The warning about wrong precision is very helpful for pointing out potential performance issues:
https://github.com/FluxML/Flux.jl/blob/2f19e681040516db9803111819fcfaf5b44f07eb/src/layers/stateless.jl#L60
I think this is the correct default behavior. However, in order to find out where the problem is coming from, throwing an error (to produce a stacktrace) would be very helpful.
Possible Implementation
There could be a Preference or a global flag that switches from warnings to errors for wrong-precision inputs. This would also be consistent with CUDA.allowscalar.
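To make the idea concrete, here is a minimal sketch of what such a global flag could look like. All names here (`WRONG_PRECISION_ERROR`, `allow_wrong_precision`, `_precision_notice`) are hypothetical and not part of Flux's API; a Preferences.jl key could serve the same role as the `Ref`.

```julia
# Hypothetical sketch, not Flux's actual implementation.
const WRONG_PRECISION_ERROR = Ref(false)

# Illustrative switch, mirroring the style of CUDA.allowscalar:
# allow_wrong_precision(false) upgrades the warning to an error.
allow_wrong_precision(flag::Bool = true) = (WRONG_PRECISION_ERROR[] = !flag; nothing)

# Sketch of how the existing warning site could consult the flag.
function _precision_notice(layer_T, input_T)
    msg = "Layer with $layer_T parameters received $input_T input; converting on every call is slow."
    if WRONG_PRECISION_ERROR[]
        error(msg)           # error -> full stacktrace pointing at the offending call
    else
        @warn msg maxlog=1   # the current friendly-warning behavior
    end
end
```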
This seems like a good idea.
Maybe it should literally be the same switch as CUDA.allowscalar, to avoid introducing ever more functions you have to know about? It's not exactly the same meaning, but it's likely to be used at the same time.
That function belongs to GPUArraysCore, which Flux doesn't directly load right now, but NNlib does.
When you say "same switch", do you mean defining something like Flux.throw_sanity_check_errors(), which flips the requested precision switch in the OP as well as turning off CUDA.allowscalar? Or do you mean that if CUDA.allowscalar is false, then we throw errors for precision? If it's the latter, that seems wrong to me, since scalar indexing and floating-point precision are totally unrelated.
I also think it makes sense to have a separate function for each of these checks. But having a Flux.enable_all_sanity_check_errors() on top of that would be good for discoverability (its docstring could also point to the relevant sub-functions).
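A rough, self-contained sketch of "one switch per check, plus an umbrella function". Every name here is illustrative only; in practice the scalar-indexing entry would defer to the real GPUArraysCore.allowscalar rather than a local flag.

```julia
# Illustrative only: one Ref per sanity check, all reachable from one place.
const SANITY_ERROR_SWITCHES = Dict{Symbol,Ref{Bool}}(
    :wrong_precision => Ref(false),
    :scalar_indexing => Ref(false),  # would really defer to GPUArraysCore.allowscalar
)

"Hypothetical umbrella: flip every sanity-check warning into an error (or back)."
function enable_all_sanity_check_errors(flag::Bool = true)
    for switch in values(SANITY_ERROR_SWITCHES)
        switch[] = flag
    end
    return nothing
end
```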
seems wrong to me since scalar indexing and floating point precision are totally unrelated
This was my suggestion! Float precision is a big deal on GPU and not otherwise. There are basically two modes of using the GPU:
- You just load CUDA, and most things work (but may be super-slow & give warnings). Probably you want friendly warnings from Flux too.
- Or you say `CUDA.allowscalar(false)`, and then things will either be fast or give an error. You do this once you have it basically working. Probably you want Flux to also tell you if you are doing anything slow?
Of course we could invent some new switch that we own to control this. But then it's one more mysterious function you have to know about. One more kind of mutable state.
Note that CUDA.jl has switched the default to disallow scalar access.
Maybe that means using the same switch is a worse idea. So if we own a switch, what's a good name for it?
```julia
julia> using CUDA

julia> first(CUDA.randn(32))
ERROR: Scalar indexing is disallowed.
Invocation of getindex resulted in scalar indexing of a GPU array.
...

julia> CUDA.allowscalar(true)
┌ Warning: It's not recommended to use allowscalar([true]) to allow scalar indexing.
│ Instead, use `allowscalar() do end` or `@allowscalar` to denote exactly which operations can use scalar operations.
└ @ GPUArraysCore ~/.julia/packages/GPUArraysCore/GMsgk/src/GPUArraysCore.jl:188

julia> first(CUDA.randn(32))
0.37104183f0

(@v1.11) pkg> st CUDA
Status `~/.julia/environments/v1.11/Project.toml`
  [052768ef] CUDA v5.5.2
```