Max Ghenis
CV is helpful for variable selection and for tuning other parameters; for example, random forests do something like CV internally (via out-of-bag error), and there are prebuilt CV methods for...
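As a minimal sketch on toy data (names assumed, not from the actual analysis), `sklearn`'s prebuilt `cross_val_score` gives a CV estimate of out-of-sample error, playing a similar role to the forest's built-in out-of-bag score:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=300, n_features=10, noise=5.0, random_state=0)

# 5-fold CV estimate of out-of-sample RMSE for a random forest
scores = cross_val_score(RandomForestRegressor(random_state=0), X, y,
                         scoring="neg_root_mean_squared_error", cv=5)
print(-scores.mean())
```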
You're comparing two models:

1. Random forests (just a single model)
2. Trinomial logit + two linear models for positive and negative logit predictions

In each case you're evaluating on...
Logit + 2 RFs could be a third model, but RF alone is worth testing, and I'd personally start by comparing a single RF to logit+LM. The single RF will perform...
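To make that comparison concrete, here's a rough sketch on toy data (all variable names assumed, not the actual analysis) of a single RF against the expected value from a trinomial logit plus two linear models:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression, LogisticRegression

# toy data with a mass of exact zeros plus positive and negative values
X, y = make_regression(n_samples=1000, n_features=5, noise=10.0, random_state=0)
y[np.abs(y) < 20] = 0
X_train, X_test, y_train, y_test = X[:800], X[800:], y[:800], y[800:]

# model 1: a single random forest
rf_pred = RandomForestRegressor(random_state=0).fit(X_train, y_train).predict(X_test)

# model 2: trinomial logit for sign, plus LMs fit on the positive and negative rows
logit = LogisticRegression(max_iter=1000).fit(X_train, np.sign(y_train))
lm_pos = LinearRegression().fit(X_train[y_train > 0], y_train[y_train > 0])
lm_neg = LinearRegression().fit(X_train[y_train < 0], y_train[y_train < 0])

probs = logit.predict_proba(X_test)  # columns follow logit.classes_ = [-1, 0, 1]
p_neg, p_pos = probs[:, 0], probs[:, 2]
two_part_pred = p_pos * lm_pos.predict(X_test) + p_neg * lm_neg.predict(X_test)

for name, pred in [("single RF", rf_pred), ("logit + 2 LMs", two_part_pred)]:
    print(name, np.sqrt(np.mean((y_test - pred) ** 2)))
```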
> This would probably lower the MSE but would avoid the "zero-fuzzing" that occurs when using the average of the entire row for all observations.

You should select a random...
> I am still a bit unclear on the motivation behind inserting randomness, which underlies some of my incorrect assumptions going into this process (directly imputing predicted categories in...
Here's an [example](https://github.com/shahejokarian/regression-prediction-interval/blob/master/linear%20regression%20with%20prediction%20interval.ipynb) using `sklearn` for prediction intervals. It's not clear whether it works for regularized models like Lasso.
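For reference, a minimal sketch on toy data of the same idea using `statsmodels`, which exposes prediction intervals directly (this is not the linked notebook's exact approach):

```python
import numpy as np
import statsmodels.api as sm

# toy data; in practice these would be the training features and target
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.5, size=200)

model = sm.OLS(y, sm.add_constant(X)).fit()

# 95% prediction interval for new rows; wider than the confidence interval
# because it includes residual noise, not just coefficient uncertainty
X_new = sm.add_constant(rng.normal(size=(5, 3)), has_constant="add")
frame = model.get_prediction(X_new).summary_frame(alpha=0.05)
print(frame[["mean", "obs_ci_lower", "obs_ci_upper"]])
```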
> Second, the EIC variable is categorical (like the MARS variable), so you should convert EIC into a set of dummy variables (omitting the first category) just like you did...
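A short sketch of that dummy-variable conversion with `pandas` (column name and values assumed for illustration):

```python
import pandas as pd

# hypothetical frame with the categorical EIC column
df = pd.DataFrame({"EIC": [0, 1, 2, 3, 1, 0]})

# one dummy per category, dropping the first to avoid collinearity
dummies = pd.get_dummies(df["EIC"], prefix="EIC", drop_first=True)
df = pd.concat([df.drop(columns="EIC"), dummies], axis=1)
```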
Thanks Avi, does this chart represent the results correctly?

Method | RMSE on full test data | RMSE on positive test data | RMSE on negative test data
-- | -- | -- | --
...
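For what it's worth, a sketch of how those three RMSE columns can be computed, assuming `y_test` holds the held-out targets and `preds` the model's predictions (names and values assumed):

```python
import numpy as np

def rmse(y_true, y_pred):
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

# example arrays standing in for the held-out targets and predictions
y_test = np.array([3.0, -1.5, 0.0, 2.0, -4.0])
preds = np.array([2.5, -1.0, 0.5, 2.2, -3.0])

pos, neg = y_test > 0, y_test < 0
print(rmse(y_test, preds))            # full test data
print(rmse(y_test[pos], preds[pos]))  # positive test data
print(rmse(y_test[neg], preds[neg]))  # negative test data
```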
> long-term I understand the goal would be to use a random tree (if RF) or random point from the CDF (if OLS) so that the imputation is stochastic, correct?...
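A rough sketch of the random-tree version with `sklearn` on toy data (one way to do it, not necessarily the planned implementation):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# toy data standing in for the donor set and the rows to impute
X, y = make_regression(n_samples=500, n_features=5, noise=10.0, random_state=0)
X_train, y_train, X_new = X[:400], y[:400], X[400:]

rf = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_train, y_train)

# predictions from every individual tree: shape (n_trees, n_rows)
tree_preds = np.stack([tree.predict(X_new) for tree in rf.estimators_])

# for each row, keep the prediction of one randomly chosen tree, so the
# imputation is stochastic rather than the smoothed forest average
rng = np.random.default_rng(0)
idx = rng.integers(tree_preds.shape[0], size=tree_preds.shape[1])
imputed = tree_preds[idx, np.arange(tree_preds.shape[1])]

# the OLS analogue would draw from the predictive distribution instead,
# e.g. ols_pred + rng.normal(0, resid_std, size=len(ols_pred))
```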
In addition to parsimony, is another rationale for nixing small weights a concern about overfitting? If so, LASSO regression comes to mind, which reduces the risk of overfitting by including an...
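For instance, a minimal sketch with `sklearn`'s `LassoCV` on toy data, where the L1 penalty shrinks small weights exactly to zero and the penalty strength is chosen by CV:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV

# toy data with only a few informative features
X, y = make_regression(n_samples=300, n_features=20, n_informative=5,
                       noise=5.0, random_state=0)

# LassoCV picks the L1 penalty strength by cross-validation
lasso = LassoCV(cv=5, random_state=0).fit(X, y)

# small weights are shrunk exactly to zero: regularization and variable selection at once
print(lasso.alpha_, (lasso.coef_ == 0).sum(), "coefficients zeroed out")
```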