MatlabStan icon indicating copy to clipboard operation
MatlabStan copied to clipboard

programmatically obtain stats

Open drbenvincent opened this issue 8 years ago • 6 comments

Hi Brian Another feature request, sorry...

Currently, it seems that doing output = stanFit.print(); outputs all the raw text as a cell array.

I wondered if it was possible to obtain selected stats for particular variables using some command of the form: selectedStats = fit.getStats({'mu','tau'}, {'mean','R_hat','50%'}); Which could return mean, 50% and R_hat for the variables listed.

Alternatively, the entire table of stats could be returned in a matlab table format.

This would be really great. I have a toolbox released alongside a paper last year, and I am trying to convert it from using JAGS to STAN. I think this is one of my last stumbling blocks to getting it all working. I tried to look into the code to see if it was easy to do myself, but it doesn't immediately obvious to me.

Ben

drbenvincent avatar Apr 04 '16 15:04 drbenvincent

Having looked into it a bit more, it seems like CmdStan doesn't provide info in this level of granularity, just the text output of print or stansummary.

So looks like the solution will have to be:

  • save this to a text file: around line 331 in StanFit.m
  • add a function to parse this file to get the info I need

drbenvincent avatar Apr 05 '16 07:04 drbenvincent

It seems feasible to parse the raw text to be searchable. However, if you just need the means or percentiles, you could compute them on the fly in Matlab. If you need R_hat on the other hand, it will have to come from the text output.

brian-lau avatar Apr 05 '16 11:04 brian-lau

Thanks Brian. Yes, I'm after R_hat amongst other things. I might try to fork, and add a parser to put the values in some kind of structure. Although I'm not so experienced with string processing.

drbenvincent avatar Apr 05 '16 11:04 drbenvincent

I just saw this post. Here is a snippet of code I use to grab r_hat and n_eff from a StanFit model for each row that I care about. I use this to check convergence of lp__, mu_, sig_u_, and sig_e_. This will pull r_hat and n_eff values for those rows (regardless of the number of rows that have those parameters). If you care about other rows, you can just add in the text to the inp2check cell array.

This code may or may not be helpful as you'll still have to compute any means or percentiles you want, but it may at least help you pull the values you care about. (unless I was too slow to respond...)

%pull out r_hats from Stanfit model

%pull summary using the print function
%there's no other way to get at the r_hat values, so output will also be
%printed in the command window
output = print(fit);

%define cell array that holds the r_hats
convstats = cell(0,3);
convstats{1,1} = 'name';
convstats{1,2} = 'n_eff';
convstats{1,3} = 'r_hat';

%cycle through the important inputs
%depending on whether there are multiple events/groups there may be 
%multiple mu_, sig_u_, and sig_e_
inp2check = {'lp__' 'mu_' 'sig_u_' 'sig_e_'}; 

for j = 1:length(inp2check)
    check = strncmp(output,inp2check{j},length(inp2check{j}));
    ind2check = find(check == 1);
    for i = 1:length(ind2check)
        pullrow = strsplit(output{ind2check(i),:},' ');
        label = pullrow{1};
        neff = str2num(pullrow{end-2});
        rhat = str2num(pullrow{end});
        convstats{end+1,1} = label;
        convstats{end,2} = neff;
        convstats{end,3} = rhat;
    end
end

peclayson avatar Apr 19 '16 02:04 peclayson

Thanks! Not too late, but this is on hold while I do exam marking :( I am also going to write to the cmdstan folks to see if request specific info. Will post any progress here.

drbenvincent avatar Apr 26 '16 10:04 drbenvincent

The print method now returns a table variable as a second output (commit d99f570da), and is documented in the wiki: https://github.com/brian-lau/MatlabStan/wiki/CmdStan-summary

I'm implementing chain stats more generally in the mcmc class, and will expose these to StanFit objects shortly

brian-lau avatar May 16 '17 05:05 brian-lau