simstudy
Sequential dependent simulation
I just added some code in the non-prop-odds-assumption branch that shows it is possible to generate data that is sequentially dependent over time. The user would specify a data definition that looks like this:
   varname                formula variance   dist     link
1:      y0                      0        2 normal identity
2:      x0                      0        4 normal identity
3:    y<t>  .5*x<t-1> + .8*y<t-1>        2 normal identity
4:    x<t> .6*x<t-1> + 0.4*x<t-2>        4 normal identity
where t represents a time period. The new function essentially converts this definition table into an expanded definition table that builds y1, x1, y2, x2, etc., depending on how many periods the user specifies. Here is what the new table would look like based on t=5:
    varname          formula variance   dist     link
 1:      y0                0        2 normal identity
 2:      x0                0        4 normal identity
 3:      y1    .5*x0 + .8*y0        2 normal identity
 4:      x1 .6*x0 + 0.4*x0*0        4 normal identity
 5:      y2    .5*x1 + .8*y1        2 normal identity
 6:      x2   .6*x1 + 0.4*x0        4 normal identity
 7:      y3    .5*x2 + .8*y2        2 normal identity
 8:      x3   .6*x2 + 0.4*x1        4 normal identity
 9:      y4    .5*x3 + .8*y3        2 normal identity
10:      x4   .6*x3 + 0.4*x2        4 normal identity
11:      y5    .5*x4 + .8*y4        2 normal identity
12:      x5   .6*x4 + 0.4*x3        4 normal identity
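To make the expansion step concrete, here is a rough sketch of how the templated rows could be unrolled into the table above. It is written in Python purely for illustration and is not the code in the branch; `expand_definitions`, `PLACEHOLDER`, and the tuple layout are hypothetical names. It also assumes, based only on the x1 row above, that a lag reaching before period 0 is rewritten as a zeroed-out term (`x0*0`).

```python
import re

# Matches placeholders such as x<t>, y<t-1>, x<t-2>.
PLACEHOLDER = re.compile(r"([A-Za-z_]\w*)<t(?:-(\d+))?>")


def expand_definitions(defs, periods):
    """Expand (varname, formula, variance) templates for t = 1..periods."""
    base = [d for d in defs if "<t>" not in d[0]]
    templates = [d for d in defs if "<t>" in d[0]]

    def resolve(text, t):
        def swap(match):
            name, lag = match.group(1), int(match.group(2) or 0)
            index = t - lag
            # Assumed convention, inferred from the x1 row in the table:
            # a lag that reaches before period 0 becomes a zeroed-out term.
            return f"{name}{index}" if index >= 0 else f"{name}0*0"

        return PLACEHOLDER.sub(swap, text)

    expanded = list(base)
    for t in range(1, periods + 1):
        for varname, formula, variance in templates:
            expanded.append((resolve(varname, t), resolve(formula, t), variance))
    return expanded


defs = [
    ("y0", "0", 2),
    ("x0", "0", 4),
    ("y<t>", ".5*x<t-1> + .8*y<t-1>", 2),
    ("x<t>", ".6*x<t-1> + 0.4*x<t-2>", 4),
]
expanded = expand_definitions(defs, periods=5)
for row in expanded:
    print(row)
```

With `periods=5` this yields 12 rows in the same order as the expanded table: the two baseline definitions followed by a y and an x definition for each period.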
Once this wide file is generated, a long version can easily be created, so the final data set looks like this:
      id period          y          x
  1:   1      0 -0.6700170  0.1641139
  2:   1      1 -2.0767351  0.5040484
  3:   1      2 -1.4610563 -0.4516962
  4:   1      3 -2.4996284 -3.6780939
  5:   1      4 -4.3509236 -2.2907568
 ---
596: 100      1  2.5804474  0.4800748
597: 100      2  1.5806715  1.8110704
598: 100      3  2.6266267 -1.5400280
599: 100      4  1.3507070 -2.9591094
600: 100      5 -0.8595617 -2.9844094
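For anyone who wants to see the whole flow end to end, the following is a minimal sketch of the wide-then-long idea, again in Python for illustration only and not simstudy code. `simulate_wide` and `to_long` are hypothetical helpers; the recursion is hard-coded from the expanded table rather than parsed from it, and the variance column is assumed to be a variance (so the standard deviation is its square root). The values will not match the output shown above, since the seed and random generator differ.

```python
import math
import random


def simulate_wide(n, periods, seed=1):
    """Draw one wide record per id: y0, x0, y1, x1, ..., following the table."""
    rng = random.Random(seed)
    rows = []
    for i in range(1, n + 1):
        row = {
            "id": i,
            "y0": rng.gauss(0, math.sqrt(2)),
            "x0": rng.gauss(0, math.sqrt(4)),
        }
        for t in range(1, periods + 1):
            # x<t-2> does not exist at t = 1; treat it as 0 (the "x0*0" term).
            x_lag2 = row[f"x{t - 2}"] if t >= 2 else 0.0
            y_mean = 0.5 * row[f"x{t - 1}"] + 0.8 * row[f"y{t - 1}"]
            x_mean = 0.6 * row[f"x{t - 1}"] + 0.4 * x_lag2
            row[f"y{t}"] = rng.gauss(y_mean, math.sqrt(2))
            row[f"x{t}"] = rng.gauss(x_mean, math.sqrt(4))
        rows.append(row)
    return rows


def to_long(wide_rows, periods):
    """Reshape wide records into one row per id and period."""
    return [
        {"id": r["id"], "period": t, "y": r[f"y{t}"], "x": r[f"x{t}"]}
        for r in wide_rows
        for t in range(periods + 1)
    ]


wide = simulate_wide(n=100, periods=5)
long_rows = to_long(wide, periods=5)
print(len(long_rows))
print(long_rows[0])
```

With 100 ids and periods 0 through 5, the long data set has 600 rows, which matches the row count in the output above.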
Obviously, much remains to be done or improved, but this is the idea.
This looks very useful, but it would not work with formulas as expressions (#75), as this is not valid R syntax. I am sure there is a way we can do it though!