ModelParameters.jl icon indicating copy to clipboard operation
ModelParameters.jl copied to clipboard

Param in an array turns into Float when accessing array.

Open thomvet opened this issue 2 years ago • 8 comments

Packing one or more Param() into an Array that also contains some other number and then accessing elements in the array turns the Param()s into Float (or whatever the parent type is). The following code snippet reproduces the behavior:

a = [Param(1.0), 1.0]
typeof(a) #Vector{Number}
a[1] #1.0
typeof(a[1]) #Float64

This is quite unintuitive to me. Is it a bug? When would this be desired?

thomvet avatar Oct 12 '22 13:10 thomvet

This is due to type conversions. When you pack two values with different types into a list, Julia automatically tries to upcast them both to a common abstract type, in this case Number. The current implementation of convert for Param to type Number extracts the value (with units) and returns it. This is mostly helpful for making Param behave like a number in arithmetic expressions, I think, but @rafaqz can weigh in if there is another use case that I am not thinking of.

I agree that this is not intuitive behavior (and the choice to use convert this way arguably violates the method interface since normally conversions do not discard information), at least when putting Params in an array.

The good news is, however, that you should not do this anyway. It's a bad idea to work with arrays that have abstract (mixed) types, as it will almost always kill performance by creating type instability.

It would be a better idea to use a Tuple to store Params alongside other types (numbers or otherwise).

bgroenks96 avatar Oct 12 '22 19:10 bgroenks96

Yes Julia is helpfully trying to stop you doing what you're doing ;)

If you want to do this anyway, you will have to force the array to have type Any or some Union. Which is what you want anyway. Just make it explicit.

a = Any[Param(1.0), 1.0]

(Otherwise use a tuple, which is nearly always the right choice for lists of parameters)

rafaqz avatar Oct 12 '22 20:10 rafaqz

Thanks for your advice and insights. I am aware that heterogeneous arrays are not performant and I agree that feeding in a tuple here would be preferrable here.

The thing is that I wanted to stay flexible on the user side on my (company internal) package, i.e., allowing users to feed in arrays or tuples for the parameters. My user base is not necessarily aware of the difference between tuples and arrays (some are, some aren't). In fact, in the internals of my package, I do have a function that converts parameter arrays into tuples, so that I can deal with them in a typestable manner. I was surprised that this function simply ripped away the Param() types and converted the input into Floats.

It's probably something that I will just have to resolve with documentation and error messages at this point, so that only tuples are allowed to be fed as parameter lists. That's fine with me (just less flexible) if there's no way around this on a more fundamental level. :)

thomvet avatar Oct 13 '22 06:10 thomvet

Ok that makes sense, and does sound like it will be confusing to new people.

The whole Params are also numbers thing is find of a gimmick honestly, it might be better if we just forced stripping them out before using a model.

Edit: also note, other things like ints, floats and bools will do the same thing in arrays, which can also be a problem. It's really best to communicate not to use arrays like this.

And @bgroenks96 that means there is a probably unfortunate precedent for this loss of in formation: Bool looses "truthiness" if mixed with other numbers in arrays:

  julia> [true, false][1] ? "truthy" : "falsy"
"truthy"

julia> [true, 0][1] ? "truthy" : "falsy"
ERROR: TypeError: non-boolean (Int64) used in boolean context
Stacktrace:
 [1] top-level scope
   @ REPL[7]:1

rafaqz avatar Oct 13 '22 06:10 rafaqz

That's not really information loss, though. true and false carry the same amount of information as 1 and and 0. It's just a matter of representation and/or type semantics. Julia rightfully doesn't let you treat integers as booleans without explicitly casting them. Similarly, [1.1, 1] downcasts to Vector{Float64} which is fine since integers are a subset of floating point numbers (reals).

It still can be surprising, no matter what is getting converted, however. And yes, all the more reason to discourage doing this.

bgroenks96 avatar Oct 17 '22 11:10 bgroenks96

Also, yes, I would be in favor of getting rid of the whole Param as a number thing. I always strip them in practice... it doesn't work consistently enough to be actually useful.

bgroenks96 avatar Oct 17 '22 11:10 bgroenks96

I mean Bool is losing type domain information, which removes its main use case. Julia objects hold type information as well as value information, it's even physically contained in the boxing at runtime when there is type instability. And Bool type information is more different than most Real numbers - there are many more places we expect it to work differently. Even indexing will become an error rather than a filter after you convert Bool to Real.

But yes, lets think about removing <: AbstractNumber, I thought it would be more useful than it is.

rafaqz avatar Oct 17 '22 12:10 rafaqz

Sure, but that's pretty consistent with what one expects from convert. It "converts" something of one type into another, so naturally the pre-conversion type will be lost; what is exceptional here, is that we throw away actual data stored in the Param struck when we extract the :val.

That being said, I just realized there is a rather obvious precedent for this: convert(Float32, maxintfloat(Float64)) discards information in the mantissa since Float32 is too small to represent the value. So my original point is kind of moot.

bgroenks96 avatar Oct 17 '22 14:10 bgroenks96