EnsembleKalmanProcesses.jl
Named tuples for parameters
For applications with large numbers of parameters input into complex models, it's useful (even necessary) to use objects like NamedTuple that associate parameter names with values to hold and transport lists of parameters. However, as far as I can tell, parameters are constrained to be vectors. Throughout the code, in fact, many data structures seem constrained to Array or Vector. For example, this code does not work:
using Distributions
using LinearAlgebra
using Random
using EnsembleKalmanProcesses.EnsembleKalmanProcessModule
using EnsembleKalmanProcesses.ParameterDistributionStorage
using EnsembleKalmanProcesses.EnsembleKalmanProcessModule: EnsembleKalmanProcess, construct_initial_ensemble
rng_seed = 41
Random.seed!(rng_seed)
# The system
θ★ = (a=1, b=-1)
G₁(θ) = [sqrt((θ.a - θ★.a)^2 + (θ.b - θ★.b)^2)]
# Hyperparameters
N_ensemble = 50
noise_level = 1e-3
unconstrained_prior_distributions = (a=Parameterized(Normal(0, 1)), b=Parameterized(Normal(0, 1)))
constraints = (a=no_constraint(), b=no_constraint())
parameter_names = ("a", "b")
prior = ParameterDistribution(unconstrained_prior_distributions, constraints, parameter_names)
and I get
julia> prior = ParameterDistribution(unconstrained_prior_distributions, constraints, ["a", "b"])
ERROR: MethodError: no method matching ParameterDistribution(::NamedTuple{(:a, :b), Tuple{Parameterized, Parameterized}}, ::NamedTuple{(:a, :b), Tuple{Constraint, Constraint}}, ::Vector{String})
Is it necessary to constrain parameters to type Vector? It seems like it should be possible to generically support any iterable object (or objects that support broadcasting?). In the case that we need to perform LinearAlgebra operations, we can construct the appropriate matrices or vectors from iterable containers as needed:
julia> parameters = (a=1, b=2)
(a = 1, b = 2)
julia> parameter_vector = [θ for θ in parameters]
2-element Vector{Int64}:
1
2
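To make the point concrete, here is a small sketch (plain Julia, no package code) showing that the named container and the flat vector carry the same information, so converting between them is lossless:

```julia
# Hypothetical round trip between a NamedTuple of parameters and the
# flat vector representation the algorithms could use internally.
parameters = (a = 1, b = 2)

# Flatten to a plain vector for linear-algebra operations...
parameter_vector = collect(values(parameters))

# ...and rebuild the named container afterwards, with nothing lost.
reconstructed = NamedTuple{keys(parameters)}(Tuple(parameter_vector))
@assert parameter_vector == [1, 2]
@assert reconstructed == parameters
```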
Allocations could even be minimized, though I think this is usually unnecessary because typical applications of EnsembleKalmanProcesses.jl have computations dominated by the cost of evaluating a loss function.
I'm happy to get to work on refactoring the internals to support more general parameter containers if someone can identify the minimal requirements for the parameters object. It's tough to figure out which methods would have to be defined without detailed knowledge of the code, because the struct definitions are over-constrained, preventing experimentation.
Internally to the algorithms, as you say, we treat parameter instances as columns in a single ensemble matrix. This is because once we have initialized the prior mean and covariance and have an initial ensemble, we work with Gaussian methods (completely determined by this information). Therefore everything is treated as vectors in R^d and actions are matrix-vector type operations. We have src/DataStorage.jl to keep, e.g., dimensions consistent so these operations succeed. We do require parameter information again when passing back to the physical model, and there we just need to give back the values to the right parameter. At the moment this is done by assigning the name to the column; see the (terribly named) batch function in src/ParameterDistribution.jl:
function batch(pd::ParameterDistribution)
Returns a list of contiguous [collect(1:i), collect(i+1:j),... ] used to split parameter arrays by distribution dimensions
In general applications, parameters may be defined as multidimensional too, e.g. with values a = [0.0], b = [1.0, 2.0, 4.0, 6.0, 9.6] (a is defined with a 1D prior distribution, b with a 5D prior distribution). In this case, for example with [a, b], the parameter matrices will be 6 by ensemble size, and the batch function will return [collect(1:1), collect(2:6)] on return to slice back into a and b.
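To illustrate (a plain-Julia sketch, not the package's actual implementation), here is how index ranges of that form slice a stacked parameter matrix back into the named pieces:

```julia
# Index ranges of the kind batch(pd) is described as returning, for a
# 1-dimensional parameter a stacked above a 5-dimensional parameter b.
batches = [collect(1:1), collect(2:6)]

N_ens = 3                              # hypothetical ensemble size
θ = reshape(1.0:18.0, 6, N_ens)        # 6 × N_ens stacked parameter matrix

a_block = θ[batches[1], :]             # rows belonging to parameter a
b_block = θ[batches[2], :]             # rows belonging to parameter b
@assert size(a_block) == (1, N_ens)
@assert size(b_block) == (5, N_ens)
```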
I'm sure the user interface could be cleaner, as I certainly didn't optimize it. But internally I felt it was quite robust and doesn't require more functionality than ensemble Kalman methods require (e.g. matrices and vectors with consistently enforced dimensions).
I think what I'm asking is what methods need to be defined for a parameter type. We can relax the type restrictions in the struct definitions, and design an interface based on methods. Then many types could be used by savvy users to store parameters, provided those types define the right methods. Flexible parameter types might be useful for complicated loss functions (such as ours).
I'll experiment a bit and open a PR that will hopefully illustrate what I'm talking about!
Just to provide a simple example, broadcasting works for any subtype of AbstractArray provided that it defines a couple of methods like size, getindex, setindex!, etc. We don't have to broadcast with a literal Array --- we can broadcast with any "array-like" object. Similar flexibility and Julian design here would create a powerful and flexible infrastructure.
Internally to the algorithms, as you say, we treat parameter instances as columns in a single ensemble matrix. This is because once we have initialized the prior mean and covariance and have an initial ensemble, we work with Gaussian methods (completely determined by this information). Therefore everything is treated as vectors in R^d and actions are matrix-vector type operations.
I think what I am saying is that there are many Julia types that may be used to represent "vectors in R^d". I think it would help to write the software in a way that supports many vector-like types --- not just Vector. Here's a simple example from the Julia documentation:
struct SquaresVector <: AbstractArray{Int, 1}
count::Int
end
Base.size(S::SquaresVector) = (S.count,)
Base.IndexStyle(::Type{<:SquaresVector}) = IndexLinear()
Base.getindex(S::SquaresVector, i::Int) = i*i
Note that SquaresVector is not the type Vector, but it nevertheless has properties indistinguishable from those of a mathematical vector:
julia> s = SquaresVector(4)
4-element SquaresVector:
1
4
9
16
julia> s[s .> 8]
2-element Vector{Int64}:
9
16
julia> s + s
4-element Vector{Int64}:
2
8
18
32
julia> sin.(s)
4-element Vector{Float64}:
0.8414709848078965
-0.7568024953079282
0.4121184852417566
-0.2879033166650653
This shows why we don't want to over-restrict types in struct definitions --- allowing flexible types opens the possibility for powerful behaviors. "Alternative array types" like SquaresVector only need to define the correct methods for the code to work, and the methods that need to be defined are sometimes surprisingly few.
Essentially, we should almost never restrict the types of objects, except in very special cases (which probably do not occur in our software).
@glwagner We should move away from deleting the typing in arguments or in the fields of structs. It adds flexibility at the cost of interpretability of the codebase. What we should do instead is substitute the types with more general (possibly abstract) types.
I think changing the typing to more abstract types would really benefit the codebase. However, unit testing becomes slightly more difficult (there are a lot of corner cases). I am not familiar with this kind of typing in Julia, but we should implement types similar to those in Python's collections.abc module.
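For what it's worth, a minimal sketch of that suggestion in Julia (the names here are hypothetical): bounding a struct field by an abstract type via a type parameter keeps the field interpretable and the storage concrete, while still admitting any AbstractVector:

```julia
# Hypothetical struct: the field is bounded by AbstractVector rather than
# hard-coded as Vector{Float64}; the concrete type is still inferred,
# so there is no performance cost relative to a concrete annotation.
struct EnsembleState{V <: AbstractVector}
    u::V
end

mean_state(s::EnsembleState) = sum(s.u) / length(s.u)

s1 = EnsembleState([1.0, 2.0, 3.0])   # a plain Vector works
s2 = EnsembleState(1:3)               # so does a lazy UnitRange
@assert mean_state(s1) == mean_state(s2) == 2.0
```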
You cannot capture what @glwagner is promoting with abstract types, because the code here relies on a composition of required type traits / interfaces (e.g. generic iteration, get/set index, etc.). No one abstract type hierarchy will necessarily capture the composition of traits that the package needs. In general, that is why you rely on dynamic polymorphic behavior (e.g. no constraint on types) and raise error messages when a particular type instance does not implement some method. This makes the code compositional and easier to extend. Because Julia doesn't have explicit declaration of type traits, you rely on implicit runtime behavior (as in all dynamic languages), comments, and a good test suite.
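As a small sketch of that style (plain Julia, hypothetical function name): leave the argument unannotated, rely on the iteration interface, and let a missing method surface as a MethodError at the point of use:

```julia
# No type constraint: anything that supports iteration works.
sum_of_squares(params) = sum(x^2 for x in params)

@assert sum_of_squares((a = 1, b = 2)) == 5   # NamedTuple iterates its values
@assert sum_of_squares([1, 2]) == 5           # so does a Vector
@assert sum_of_squares((1, 2)) == 5           # or a Tuple

# A type that doesn't implement iteration fails with a clear MethodError.
try
    sum_of_squares(nothing)
    @assert false  # unreachable
catch err
    @assert err isa MethodError
end
```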
I am not sure I understand why types cannot capture everything we need. If you need to say that an argument allows generic iteration, it should be an iterable. If it needs to be used as a dict key, it is hashable.
An example of how this is possible, while maintaining extreme generality of the inputs in a dynamically typed language, is Python's xarray; see some of the function signatures there, for instance. Typing helps a lot with debugging, which is kind of the weakest point of Julia. It's either that or adding asserts in every function.