ashr general ash

@mengyin can you take a look at the code I've got in my general branch? I think it incorporates much, but not yet all, of what you had in your branch, but without all the repetition. It seems to run OK. There are still some things to do, but thought it would be useful for you to look and see if, for example, the results for logF errors look similar to yours. (note that for now I think df1 and df2 have to be the same for all observations...; changing this should not be too hard, but not done yet)

To see how to run it, see my test in tests/testthat/test_logF.R set.seed(10) e = rnorm(100)+log(rf(100,df1=10,df2=10)) e.ash = ash(e,1,lik=logF_lik(df1=10,df2=10)) s = sd(log(rf(10000,10,10))) # get approximate standard deviation by simulation e.ash.norm = ash(e,s) expect_equal(e.ash.norm$PosteriorMean,e.ash$PosteriorMean,tol=0.05)

and also the examples under ?ash.workhorse

Jul 04 '16 05:07 stephens999

Yes the general branch's results look very similar to my old results! Anyway in my current simulations df1 and df2 are always the same for all observations, so it is definitely fine...

Jul 06 '16 00:07 mengyin

that's good. A few things to resolve:

you wrote a replacement to my "gradient" function that seems to be a fix to a divide by zero problem. I guess you got some errors that caused this problem, and we could just replace my function with yours?
still need to deal with the autoselect issue, and work out how to let the user specify a way to select a grid that depends on the likelihood. Maybe the autoselect method should be specified when specifying likelihood (or a default supplied)
may want to allow the user to just specify pdf of likelihood, and compute cdf by integrate? This might be a future feature.
I have an implementation that allows each observation to have its own likelihood about to be pushed to the general-multilik branch, but I'm not 100% happy with the way I did it... still thinking about it.

Jul 06 '16 01:07 stephens999

Yes I got errors sometimes with the old "gradient" function, replacing it by my new gradient function fixed the problem divide by zero and didn't affect results in general.
For the autoselect function, my first thought was choosing the grid based on the shape of error distribution...but computing the distribution's moments itself can be non-trivial (numerical integration is unstable). Anyway I guess it's no big deal as long as the chosen grid's range is wide enough.
Yes it'd be great if the user just need specify pdf (or just cdf and compute pdf by numerical differentiation? might be unstable though)!

Jul 06 '16 03:07 mengyin

for autoselect, maybe we could use the cdf instead of the moments? I only used moments in the normal case as a shorthand for making sure the grid spanned "plausible" values.

Jul 06 '16 12:07 stephens999

I see..say, use the values where the cdf is almost 1 (or 0) to decide the "plausible" range?

Jul 07 '16 00:07 mengyin

ashr ashr copied to clipboard

general ash

ashr
ashr copied to clipboard