StructEst_W20
Results for Gamma, Generalized gamma, GB2
Hi professor,
For questions 1(b), 1(c), and 1(d), I got quite similar results (both for the parameters and for the maximized likelihood). That seems odd to me, and because of it I got a nonsense result for question 1(e).
But I also think the estimation results might be reasonable, since gamma is a special case of generalized gamma (when m = 1), generalized gamma is a special case of GB2 (as q goes to infinity), and we set the initial values of m and q to match those special cases.
Are my results and reasoning on the right track, or is something wrong?
Thank you very much.
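The nesting mentioned above (gamma as GG with m = 1, and GG as the limit of GB2 as q goes to infinity) can be checked numerically. Below is a minimal sketch, assuming the usual parameterizations GA(x; alpha, beta), GG(x; alpha, beta, m), and GB2(x; a, b, p, q) with the mapping a = m, b = q^(1/m) * beta, p = alpha / m. The function names and the parameter values are made up for illustration and are not anyone's estimates from this thread.

```python
import math
import numpy as np


def gamma_pdf(x, alpha, beta):
    """Gamma pdf: x^(alpha-1) exp(-x/beta) / (beta^alpha Gamma(alpha))."""
    x = np.asarray(x, dtype=float)
    log_pdf = ((alpha - 1) * np.log(x) - x / beta
               - alpha * math.log(beta) - math.lgamma(alpha))
    return np.exp(log_pdf)


def gen_gamma_pdf(x, alpha, beta, m):
    """GG pdf: m x^(alpha-1) exp(-(x/beta)^m) / (beta^alpha Gamma(alpha/m))."""
    x = np.asarray(x, dtype=float)
    log_pdf = (math.log(m) + (alpha - 1) * np.log(x) - (x / beta) ** m
               - alpha * math.log(beta) - math.lgamma(alpha / m))
    return np.exp(log_pdf)


def gb2_pdf(x, a, b, p, q):
    """GB2 pdf: a x^(ap-1) / (b^(ap) B(p, q) (1 + (x/b)^a)^(p+q))."""
    x = np.asarray(x, dtype=float)
    # log of the beta function B(p, q), computed in logs for stability
    log_beta_fn = math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q)
    log_pdf = (math.log(a) + (a * p - 1) * np.log(x) - a * p * math.log(b)
               - log_beta_fn - (p + q) * np.log1p((x / b) ** a))
    return np.exp(log_pdf)


# Hypothetical parameter values, chosen only for illustration
x = np.linspace(10.0, 800.0, 50)
alpha, beta, m = 1.5, 300.0, 0.8
big_q = 1e6

ga = gamma_pdf(x, alpha, beta)
gg_m1 = gen_gamma_pdf(x, alpha, beta, 1.0)            # GG with m = 1
gg = gen_gamma_pdf(x, alpha, beta, m)
gb2_limit = gb2_pdf(x, a=m, b=big_q ** (1 / m) * beta, p=alpha / m, q=big_q)

print("max |GG(m=1) - GA|  :", np.max(np.abs(gg_m1 - ga)))
print("max |GB2(big q) - GG|:", np.max(np.abs(gb2_limit - gg)))
```

If the optimizer is started exactly at the nested values, it starts at a point where the larger model reproduces the smaller one, so nearly identical estimates and likelihoods are not surprising by themselves.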
@takando. Post a plot of your estimated distributions laid over your histogram plot. Mine looks like this. The GG and GB2 are pretty similar.
Thank you for the plot. My plot is below. Actually, Gamma and GG are slightly different, and GG and GB2 are exactly the same...
One more thing: in 1(a) I plotted actual health expenditures on a frequency scale, but here I rescaled the histogram to a density (by setting density=True), since our pdf functions return probability densities. So I am also not sure about this scaling issue.
I have the same puzzle. I think @rickecon's graph is also a density histogram rather than a percentage histogram. Also, the density histogram you plotted is not adjusted for the data over $800.
I tried using the adjusted data density and the estimated GA density curve, and it looks like this:
This is interesting. I can get the same density-scale plot as yours by dividing the frequencies by the bin widths, as below.
But the shape of the gamma distribution is still quite different. What is your maximized log likelihood for the gamma distribution? Mine is around -56700, and I'm not sure whether that is too large.
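For reference, the rescaling mentioned above (dividing frequencies by the bin widths) can be sketched as follows. This is only an illustration on synthetic data, not the health expenditure data, and it assumes "frequency" means raw counts, so the counts are also divided by the number of observations. The result matches what `density=True` produces.

```python
import numpy as np

# Synthetic stand-in for the expenditure data (hypothetical values)
rng = np.random.default_rng(0)
data = rng.gamma(shape=0.5, scale=900.0, size=5_000)

# Raw counts per bin
counts, edges = np.histogram(data, bins=50)
widths = np.diff(edges)

# Density height = (share of observations in the bin) / (bin width)
density_manual = counts / (counts.sum() * widths)

# Same thing using NumPy's built-in density normalization
density_builtin, _ = np.histogram(data, bins=edges, density=True)

print("total area:", np.sum(density_manual * widths))
print("matches density=True:", np.allclose(density_manual, density_builtin))
```

Because the bar areas sum to one, a histogram on this scale is directly comparable to a pdf curve drawn over the same range of data.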
I guess @takando used only the expenditures below 800 to estimate the MLE, didn't you? I think we should use the whole dataset, not just that subset, for the estimation.
Here is my result (for the scale, I adjusted the weights so that the area of the histogram is about 0.85, because the histogram shows only about 85% of all observations).
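A rough sketch of the weighting described above, in case it helps. It uses synthetic data and a hypothetical cutoff and bin count, so the displayed share will not be exactly 85%. The idea is to give every displayed observation a weight of 1 / (N * bin width), where N is the size of the full sample, so the histogram area equals the share of observations below the cutoff.

```python
import numpy as np

# Synthetic stand-in for the expenditure data (hypothetical values)
rng = np.random.default_rng(0)
data = rng.gamma(shape=0.5, scale=900.0, size=10_000)

cutoff = 800.0      # only observations below this are displayed
num_bins = 100      # hypothetical bin count

shown = data[data < cutoff]
bin_width = cutoff / num_bins

# Each displayed observation contributes 1 / (N_total * bin_width) to its bar,
# so bar heights are densities relative to the FULL sample.
weights = np.full(shown.size, 1.0 / (data.size * bin_width))

heights, edges = np.histogram(shown, bins=num_bins, range=(0.0, cutoff),
                              weights=weights)

area = np.sum(heights * np.diff(edges))
share_shown = shown.size / data.size

print("histogram area        :", area)
print("share of obs displayed:", share_shown)
```

The same `weights` array can be passed to `plt.hist(shown, bins=num_bins, range=(0, cutoff), weights=weights)`. A pdf estimated on the full dataset should then line up with the bars without any further rescaling.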
Oh yes, you're right! We limited the data to < 800 only for visualization, and we should still estimate on the entire dataset.
Thanks so much!
This is my plot. The curves for GB2 and GG are closer, but there is still a visible difference.
@takando @Kent0417 @cytwill. I just made a big post on Issue #4 (specific comment link) about how to normalize the histogram.
And yes, you need to estimate your distributions using the entire, uncensored data. I got an answer similar to what @cytwill just posted here, as you can see in my comment in Issue #4.