Abraham-Leventhal
@MaxGhenis OK thanks, I will implement that. I am going to post an initial question regarding my OLS model soon as well.
The first issue I've run into using an OLS model that regresses on log(P22250) is that the model has so far been producing absurd predictions (in the billions) for P22250,...
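For context, a minimal sketch of the setup I'm describing, with synthetic data and hypothetical predictors `x1`/`x2` standing in for the real feature set; the `np.exp()` back-transform is where modest log-scale errors can blow up into implausibly large values:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in data; in the real notebook this would be the PUF,
# and x1/x2 are hypothetical placeholders for the actual predictors.
rng = np.random.default_rng(0)
df = pd.DataFrame({"x1": rng.normal(size=500), "x2": rng.normal(size=500)})
df["P22250"] = np.exp(1.0 + 0.5 * df["x1"] + rng.normal(scale=2.0, size=500))

# Fit OLS on the log of the target (positive values only in this sketch).
df["log_p22250"] = np.log(df["P22250"])
model = smf.ols("log_p22250 ~ x1 + x2", data=df).fit()

# Predictions come back on the log scale; exponentiating them is where
# moderate log-scale errors can explode into enormous values.
preds = np.exp(model.predict(df))
print(preds.describe())
```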
Based on further reading, it seems that to justify log-transforming our predictors I should look at how that transformation changes the model's residuals, either making them symmetric or...
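For concreteness, a rough sketch of the residual check I have in mind (synthetic data again; the real check would use our actual model and predictors):

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import scipy.stats as stats
import statsmodels.formula.api as smf

# Synthetic data with hypothetical predictors x1/x2.
rng = np.random.default_rng(0)
df = pd.DataFrame({"x1": rng.normal(size=500), "x2": rng.normal(size=500)})
df["y"] = np.exp(1.0 + 0.5 * df["x1"] + rng.normal(size=500))
df["log_y"] = np.log(df["y"])

level_model = smf.ols("y ~ x1 + x2", data=df).fit()
log_model = smf.ols("log_y ~ x1 + x2", data=df).fit()

# Residuals-vs-fitted and normal Q-Q plots for each specification, to see
# whether the log transform makes the residuals more symmetric/normal.
fig, axes = plt.subplots(2, 2, figsize=(10, 8))
for row, (name, m) in enumerate([("levels", level_model), ("logs", log_model)]):
    axes[row, 0].scatter(m.fittedvalues, m.resid, s=5)
    axes[row, 0].set_title(f"Residuals vs fitted ({name})")
    stats.probplot(m.resid, plot=axes[row, 1])
    axes[row, 1].set_title(f"Normal Q-Q ({name})")
fig.tight_layout()
plt.show()
```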
Looking at the residuals, it doesn't appear that log-transforming the predictors has much of a beneficial effect, despite the predictions being somewhat more realistic.

#### Regressing on log(P22250) without...
@hdoupe

> By "average performance across multiple", do you mean Cross-Validation?

Cross-validation requires that you run K training/testing iterations for each model you are evaluating, such that for each...
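To illustrate the procedure with a toy example (placeholder models and synthetic data, not our actual feature set):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=500, n_features=5, noise=10.0, random_state=0)

models = {
    "ols": LinearRegression(),
    "rf": RandomForestRegressor(n_estimators=100, random_state=0),
}

# K=5: each model is trained/tested 5 times, every fold serving once as
# the test set; we then compare models on their average fold score.
for name, est in models.items():
    scores = cross_val_score(est, X, y, cv=5, scoring="neg_mean_squared_error")
    print(name, scores.mean())
```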
@hdoupe Sure, the method makes sense, and if you think K-fold cross-validation would be best I would go along with that.

> But, if you choose your predictors based...
@MaxGhenis Thanks for the suggestions. I'm working on comparing LassoCV and your earlier random forests model for this stage of the project (continuous predictions). Question: as both lasso and random...
Assuming that we'd go with the latter option (training both models only on the training dataset), I tried out both random forests and LassoCV in this notebook and it appears...
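Roughly, the train-only setup looks like this (synthetic data in place of the notebook's actual features and target):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LassoCV
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=1000, n_features=10, noise=15.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Both models see only the training split; LassoCV's internal CV for its
# penalty strength also runs entirely within the training data.
lasso = LassoCV(cv=5).fit(X_train, y_train)
rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_train, y_train)

# Compare on the held-out test split.
for name, est in [("lasso", lasso), ("rf", rf)]:
    print(name, mean_squared_error(y_test, est.predict(X_test)))
```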
OK, I had thought that since mnlogit produced better categorical predictions for P22250 than random forests (using log loss as our metric), we'd only consider mnlogit as a sign-prediction model...
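For reference, the kind of log-loss comparison I mean, with sklearn's LogisticRegression standing in for mnlogit and synthetic three-class labels standing in for P22250's sign (negative/zero/positive):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=8, n_informative=5,
                           n_classes=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Multinomial logit stand-in vs. a random forest classifier, scored on
# predicted class probabilities rather than hard labels.
mnlogit = LogisticRegression(max_iter=1000).fit(X_train, y_train)
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

for name, est in [("mnlogit", mnlogit), ("rf", rf)]:
    print(name, log_loss(y_test, est.predict_proba(X_test)))
```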
@MaxGhenis

> Logit + 2 RFs could be a third model, but RF alone is worth testing and I'd personally start with a single RF vs. logit+LM.

The single RF...
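To make the two candidates concrete, a rough sketch (synthetic data; zero vs. nonzero stands in for the sign structure, and all names here are hypothetical):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression, LogisticRegression

# Synthetic target: zero for some records, positive for the rest.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
signal = X @ rng.normal(size=5)
y = np.where(signal + rng.normal(size=1000) > 0.5, np.exp(signal), 0.0)

# Candidate 1: a single RF predicts the value directly, zeros included.
rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

# Candidate 2: logit for P(nonzero), LM for log-magnitude among nonzeros.
nonzero = y > 0
logit = LogisticRegression(max_iter=1000).fit(X, nonzero)
lm = LinearRegression().fit(X[nonzero], np.log(y[nonzero]))

# Two-stage prediction: P(nonzero) * exp(predicted log-magnitude).
two_stage = logit.predict_proba(X)[:, 1] * np.exp(lm.predict(X))
print(rf.predict(X[:5]), two_stage[:5])
```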