🌳 Gradient Boosting (XGBoost, LGBM, CatBoost)
Gradient Boosting (GBM)
In theory they are about as fast to train as random forests, but in practice you will have to try many more hyperparameter combinations, and they can overfit. Training is also harder to parallelize, because each tree is built on the errors of the previous ones. In exchange, they are often a little more accurate than random forests.
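As a rough illustration of the trade-off, here is a minimal sketch comparing a random forest and an XGBoost model on a synthetic dataset (the data and parameter values are placeholders, not recommendations):

```python
# Minimal sketch: random forest vs. gradient boosting (XGBoost sklearn API).
# The synthetic data and parameter values are illustrative only.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Random forest: trees are independent, so training parallelizes easily.
rf = RandomForestClassifier(n_estimators=200, n_jobs=-1, random_state=0)
rf.fit(X_train, y_train)

# Gradient boosting: each tree corrects the previous ones, so trees are
# built sequentially and the learning rate becomes an extra knob to tune.
gbm = XGBClassifier(n_estimators=200, learning_rate=0.1, max_depth=4,
                    random_state=0)
gbm.fit(X_train, y_train)

print("RF  accuracy:", rf.score(X_test, y_test))
print("GBM accuracy:", gbm.score(X_test, y_test))
```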
🔧 Hyperparameters
| Hyperparameter | sklearn Random Forest | XGBoost Gradient Boosting | LightGBM Gradient Boosting | Try |
|---|---|---|---|---|
| 🔷 Number of trees | n_estimators | num_round 💡 | num_iterations 💡 | 10 … 1000 |
| 🔷 Max depth of each tree | max_depth | max_depth | max_depth | 3 … 10 |
| 🔶 Min cases per final tree leaf | min_samples_leaf | min_child_weight | min_data_in_leaf | |
| 🔷 % of rows used to build each tree | max_samples | subsample | bagging_fraction | 0.8 |
| 🔷 % of features used to build each tree | max_features | colsample_bytree | feature_fraction | |
| 🔷 Learning rate (speed of training) | NOT IN FOREST | eta | learning_rate | |
| 🔶 L1 regularization | NOT IN FOREST | alpha | lambda_l1 | |
| 🔶 L2 regularization | NOT IN FOREST | lambda | lambda_l2 | |
| Random seed | random_state | seed | seed | |
- 🔷: Increasing this parameter pushes the model toward overfitting; decrease it if the model overfits (capacity).
- 🔶: Increasing this parameter pushes the model toward underfitting; decrease it if the model underfits (regularization).
- 💡: For gradient boosting it is usually better to set a generous number of trees and stop early on a validation set than to fix the number of trees in advance, as in the sketch below.
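A hedged sketch of that early-stopping workflow with the LightGBM sklearn API (the dataset and every parameter value are illustrative assumptions, not tuned recommendations; the comments map each argument to the table above):

```python
# Sketch: LightGBM with early stopping instead of a fixed number of trees.
# Data and values are placeholders; tune on your own problem.
import lightgbm as lgb
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=5000, n_features=30, random_state=0)
X_train, X_valid, y_train, y_valid = train_test_split(X, y, random_state=0)

model = lgb.LGBMClassifier(
    n_estimators=1000,       # upper bound on num_iterations; early stopping picks the real one
    learning_rate=0.05,      # learning_rate (eta in XGBoost)
    max_depth=6,             # max depth of each tree
    min_child_samples=20,    # min_data_in_leaf
    subsample=0.8,           # bagging_fraction (% of rows per tree)
    subsample_freq=1,        # enable row subsampling every iteration
    colsample_bytree=0.8,    # feature_fraction (% of features per tree)
    reg_alpha=0.0,           # L1 regularization (lambda_l1)
    reg_lambda=0.0,          # L2 regularization (lambda_l2)
    random_state=0,          # random seed
)

model.fit(
    X_train, y_train,
    eval_set=[(X_valid, y_valid)],
    callbacks=[lgb.early_stopping(stopping_rounds=50)],  # stop if no improvement for 50 rounds
)
print("Best iteration:", model.best_iteration_)
```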