xgboost-adv-workshop-LA
To discuss: main points of the XGBoost paper on arXiv:
https://arxiv.org/abs/1603.02754
Eq. 6 / Algorithm 1: what are the defaults for gamma and lambda, and how do you change them?
(per https://github.com/dmlc/xgboost/blob/master/doc/parameter.md , gamma defaults to 0 and lambda to 1)
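A minimal sketch of overriding both from the Python bindings (the synthetic data and the particular values are just filler, not recommendations):

```python
import numpy as np
import xgboost as xgb

# toy data just so train() has something to fit
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = (X[:, 0] > 0).astype(int)
dtrain = xgb.DMatrix(X, label=y)

params = {
    "objective": "binary:logistic",
    "gamma": 1.0,   # min loss reduction to make a split; default 0
    "lambda": 2.0,  # L2 penalty on leaf weights; default 1
}
bst = xgb.train(params, dtrain, num_boost_round=20)
```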
How different is this from standard gradient boosting?
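For reference, the regularized objective from the paper (Eq. 2), which is the main structural difference from vanilla gradient boosting; T is the number of leaves and w the vector of leaf weights, so the gamma and lambda asked about above enter the tree scoring directly:

```latex
\mathcal{L}(\phi) = \sum_i l(\hat{y}_i, y_i) + \sum_k \Omega(f_k),
\qquad
\Omega(f) = \gamma T + \tfrac{1}{2}\,\lambda \lVert w \rVert^2
```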
3.4 Sparsity-aware Split Finding, versus the NEWS file: the R package switched from 0 to NA to mark missing values.
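A sketch of the corresponding behavior in the Python bindings, assuming the usual NaN-as-missing convention (the tiny dataset is made up):

```python
import numpy as np
import scipy.sparse as sp
import xgboost as xgb

y = np.array([0, 1, 1])

# dense input: NaN marks missing (the Python analogue of R's NA)
X_dense = np.array([[1.0, np.nan],
                    [np.nan, 2.0],
                    [3.0, 4.0]])
dtrain = xgb.DMatrix(X_dense, label=y)

# sparse input: entries absent from the matrix are treated as missing;
# sparsity-aware split finding learns a default direction for them
X_sparse = sp.csr_matrix((np.array([1.0, 2.0, 3.0, 4.0]),
                          (np.array([0, 1, 2, 2]), np.array([0, 1, 0, 1]))),
                         shape=(3, 2))
dtrain_sparse = xgb.DMatrix(X_sparse, label=y)
```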
4.1 Column Block (data structure for parallel learning): data in each block is stored in the compressed column (CSC) format, with each column sorted by the corresponding feature value.
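A small illustration of the CSC layout using scipy (this is not xgboost's internal code, just the same storage idea):

```python
import numpy as np
import scipy.sparse as sp

X = sp.csc_matrix(np.array([[10.0,  0.0],
                            [ 0.0, 20.0],
                            [30.0, 40.0]]))
print(X.indptr)   # [0 2 4]: where each column starts in data/indices
print(X.indices)  # [0 2 1 2]: row index of each stored value
print(X.data)     # [10. 30. 20. 40.]: the stored feature values
# xgboost's column block additionally keeps each column sorted by feature
# value, so one linear scan per feature enumerates all candidate splits
```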
4.2 Cache-aware prefetching: gradient statistics are fetched into a per-thread buffer and accumulated in a mini-batch manner to reduce cache misses during split finding.
Not in the paper, but worth covering: the Rabit AllReduce runtime; distributed XGBoost on YARN on AWS; the JVM/Spark packages.
Out-of-core (on-disk) training: https://xgboost.readthedocs.io/en/latest/how_to/external_memory.html
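A hedged sketch, assuming the cache-prefix syntax described in the linked doc (the file names here are hypothetical):

```python
import xgboost as xgb

# appending "#<cache prefix>" to a libsvm file name enables
# external-memory mode per the linked doc; "train.libsvm" and
# "dtrain.cache" are placeholder names
dtrain = xgb.DMatrix("train.libsvm#dtrain.cache")
params = {"objective": "binary:logistic", "tree_method": "approx"}
bst = xgb.train(params, dtrain, num_boost_round=10)
```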