add tuple type to Stan language
Summary:
Add a tuple type to the Stan language. A tuple is a container type which consists of an ordered list of element types. In theory, the list could be of length 0 or 1; in practice we should restrict this to lists of length strictly greater than 1.
Description:
A tuple is a container type which consists of an ordered list of element types. Tuple elements can be any Stan container or primitive type.
Examples of tuple declarations, including compound declaration/definition, using parentheses to enclose the list of element types and the initial element values:
tuple(int, real) pair_i_r;
tuple(int, real) pair_i_r = ( 1, 2.0 );
tuple(int, real) pair_i_r = ( 1, 2 ); // promote ints to reals
tuple(int, tuple(int,real)) pair_i_tuple_i_r;
tuple(int, tuple(int,real)) pair_i_tuple_i_r = ( 1, (2, 3.0));
Tuple elements are accessed by position, counting from 1, e.g.:
tuple(int, real) pair_i_r = ( 1, 2.0 );
tuple(int, tuple(int,real)) pair_i_tuple_i_r = ( 1, (2, 3.0));
int i = pair_i_r.1;
real r = pair_i_tuple_i_r.2.2;
Note: symbols other than parentheses and the period could be used for these operations.
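For readers who think in terms of the generated C++, the declarations and positional access above have a rough analogue in std::tuple. This is only a sketch under that assumption, not how Stan implements tuples; the function names make_pair_i_r, make_pair_i_tuple_i_r and demo_access are made up for illustration.

```cpp
#include <cassert>
#include <tuple>

// Analogue of: tuple(int, real) pair_i_r = (1, 2);
// The second value is an int and is converted to 2.0 on construction.
inline std::tuple<int, double> make_pair_i_r() {
  return std::tuple<int, double>(1, 2);
}

// Analogue of: tuple(int, tuple(int, real)) pair_i_tuple_i_r = (1, (2, 3.0));
inline std::tuple<int, std::tuple<int, double>> make_pair_i_tuple_i_r() {
  return std::make_tuple(1, std::make_tuple(2, 3.0));
}

inline void demo_access() {
  // pair_i_r.1 in the proposal; std::get counts from 0 rather than 1.
  int i = std::get<0>(make_pair_i_r());
  // pair_i_tuple_i_r.2.2 in the proposal: second element of the second element.
  double r = std::get<1>(std::get<1>(make_pair_i_tuple_i_r()));
  assert(i == 1);
  assert(r == 3.0);
}
```

Note the off-by-one difference: the proposal counts positions from 1, while std::get counts from 0.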
Additional Information:
Implementation will proceed stepwise. For this issue, we will add the necessary nodes to stan/lang/ast and callback functions in stan/lang/generator.
Current Version:
v2.17.0
@mitzimorris So... I finally have a model on hand where having tuples of non-identical matrices would be fantastic and I have two questions about this feature: 1) does it cover tuples as containers of non-identical matrices; and 2) how's it going? want help?
Yes, it'll allow heterogeneous types. But they're predeclared and sized like everything else (except as function arguments, where there are no sizes).
In practice, it's going to look like this:
(matrix[M, N], matrix[P, Q], int) y;
y.1 = mat1;
y.2 = mat2;
y.3 = 3;
In compound form with tuple expressions:
(matrix[M, N], matrix[P, Q], int) y = (mat1, mat2, 3);
There's some ongoing discussion about naming slots---basically defining structs rather than tuples.
It is not going to be an arbitrary dynamically typed dictionary (Python) or list (R) type---that'd break our strong static typing, which I very much want to keep.
That's awesome... though can the indexing here be a variable? E.g.:
int i;
x = y[i] * 2.0;
I guess I'm one more failing test short of my own PR and then I said I'd test stuff for MPI... then I'd love to help with this if possible :)
Afraid not. That would break the ability to do static indexing. I'm sure everyone from the R/Python world is going to want list/dictionary types with dynamic typing, but that's not what this is.
I don't care so much about the R/Python world and dynamic typing, but allowing integers for indexing (e.g., from the data block) would make it easier to use the data block to specify interfaces (e.g., a joint GLM with K model matrices). Oh well.
Indexing by integer variables leads to dynamic typing of the return value. The data is dynamic, that is, its values are only known after compile time.
I do understand there are a lot of use cases where general lists would help.
I can't always get what I want.
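The static-indexing constraint discussed above can be seen in C++ as well, which is a reasonable analogy since Stan programs compile to C++. The following is a sketch under that assumption, not Stan's implementation; the names Mixed, static_element and runtime_element are hypothetical. A tuple position must be a compile-time constant because it determines the return type, while a runtime index is fine for a container whose elements all share one type.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <tuple>
#include <vector>

// Heterogeneous container: each position has its own type.
using Mixed = std::tuple<int, double, std::string>;

// The position I is a template argument, so the compiler knows the return
// type of every access: int for 0, double for 1, std::string for 2.
template <std::size_t I>
typename std::tuple_element<I, Mixed>::type static_element(const Mixed& t) {
  return std::get<I>(t);
}

// Homogeneous container: every element is a double, so the return type
// does not depend on the value of i and a runtime index is allowed.
inline double runtime_element(const std::vector<double>& v, std::size_t i) {
  assert(i < v.size());
  return v[i];
}

// By contrast, std::get<i>(t) with a runtime variable i does not compile:
// the result type would depend on a value known only when the program runs.
```

This is why an index read from the data block cannot select a tuple element, whereas it can select an element of an array whose entries all have the same type.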
We don't want tuples; we want structs, i.e., names associated with each element. Structs are on the roadmap for the Stan 3 language. We should update this issue accordingly.
This is a critical feature, but it's independent of Stan 3. Stan 3 is the opportunity to remove deprecated functions. We can add compatible stuff independently.
If we're going to have struct type objects, we need a design for how they're declared, defined, and how their elements are accessed. I'm all for having struct, by the way---just saying we need the design.
I would absolutely love structs for my project. The number of function arguments in my project is getting extreme.
Here's a syntax option. It's C-based, and we might need a new types block:
types {
  struct beta_parameters {
    real alpha;
    real beta;
  };
}
functions {
  beta_parameters myfunc(beta_parameters input) {
    // New declaration
    beta_parameters param = { 1, 2 };
    // Alternative declaration syntax
    beta_parameters param2 = { beta = 2, alpha = 1 };
    // Index
    real alpha = input.alpha;
    // Assign
    param.beta = param.alpha;
    return param;
  }
}
Pardon the nonsensical example. This is pretty much just C with the exception of the alternative declaration syntax, which is a bit cleaner.
I'm not really sure about the new block, but I don't know where else the declarations would go, other than in "functions" or at the top level.
They could pretty easily be used as parameters as well as data inputs.
Am I in the wrong ballpark? If there's enough interest, I might try implementing it.
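For comparison, here is roughly what the struct idea buys in plain C++, the language the proposed syntax borrows from. This is a sketch only and says nothing about how Stan would implement it; BetaParameters and beta_mean are hypothetical names. The point is that one named bundle replaces a growing list of separate function arguments.

```cpp
#include <cassert>

// Named fields, mirroring the proposed `struct beta_parameters`.
struct BetaParameters {
  double alpha;
  double beta;
};

// One struct argument instead of separate alpha and beta arguments.
// Returns the mean of a Beta(alpha, beta) distribution: alpha / (alpha + beta).
inline double beta_mean(const BetaParameters& p) {
  assert(p.alpha > 0 && p.beta > 0);  // both shape parameters must be positive
  return p.alpha / (p.alpha + p.beta);
}
```

A caller would write `BetaParameters p = {1.0, 3.0};` (positional initialization, as in the first declaration form above) and then pass `p` wherever both parameters are needed.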
Tuples are currently a WIP over on the stanc3 GitHub repository; see this PR for more info: https://github.com/stan-dev/stanc3/pull/675