Dan Snow
Just for fun: if there's any free time during the summer we should do a quick (day or two) spike of a model pipeline rewrite using a Python stack. I...
Multi-card sales are excluded from the sales used to train the model for multiple reasons. The model predicts values per card, not per property; as such, multi-card sales are excluded...
Now that https://github.com/DyfanJones/noctua/pull/215 is merged, we should update this repo to use the new option, preferably by setting it globally via `noctua_options(unload = TRUE)`. This will require a bit of...
Previously, the CCAO attempted to create a stacked/ensemble model using tidymodels functions. However, tidymodels' support for this method was at the time quite new, and it didn't work very well....
The Data Department recently performed some model benchmarking ([ccao-data/report-model-benchmark](https://github.com/ccao-data/report-model-benchmark)) comparing the run times of XGBoost and LightGBM. We found that the current iteration of XGBoost runs much faster than LightGBM...
In 2022, we improved the townhome valuation methodology by implementing "fuzzy grouping". Basically, townhome units with similar, but not perfectly identical, features should receive similar values. Valuations pointed out that...
# Problem

Within LightGBM, [`num_leaves`](https://lightgbm.readthedocs.io/en/latest/Parameters.html#num_leaves) is capped at 2 ^ [`max_depth`](https://lightgbm.readthedocs.io/en/latest/Parameters.html#max_depth). For example, if `num_leaves` is set to 1000 and `max_depth` is set to 5, then LightGBM will likely end...
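
The cap described above can be illustrated with plain arithmetic. This is a minimal sketch that does not call LightGBM itself; the helper name `effective_max_leaves` is hypothetical, and it assumes that a non-positive `max_depth` means "no limit", as the LightGBM parameter docs state.

```python
def effective_max_leaves(num_leaves: int, max_depth: int) -> int:
    """Upper bound on leaves per tree given both settings (hypothetical helper)."""
    if max_depth <= 0:
        # Assumption from the LightGBM docs: max_depth <= 0 means no depth limit,
        # so num_leaves is the only constraint.
        return num_leaves
    # A binary tree of depth d can hold at most 2 ** d leaves.
    return min(num_leaves, 2 ** max_depth)


# The example from the text: num_leaves = 1000, max_depth = 5.
print(effective_max_leaves(1000, 5))  # 32
```

With these settings, at most 32 of the 1000 requested leaves can be used, so most of the `num_leaves` range is unreachable whenever `max_depth` is small.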
Thanks for creating this excellent package. I created a [similar fork of treesnip](https://gitlab.com/ccao-data-science---modeling/packages/lightsnip) but am planning to replace it with `{bonsai}` in all our production models. One feature that I...