Most implementations of gradient boosting are configured by default with a relatively modest number of trees, on the order of hundreds to a few thousand. The general reason is that on most problems, adding trees beyond some point does not improve the performance of the model.
This follows from the way a boosted tree model is constructed: sequentially, with each new tree attempting to correct the residual errors made by the trees before it. The model quickly reaches a point of diminishing returns, where each additional tree contributes less than the last.
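As a minimal sketch of this effect, the snippet below trains boosted ensembles of increasing size on a synthetic classification problem and reports held-out accuracy. It uses scikit-learn's GradientBoostingClassifier as a stand-in (the same principle applies to XGBoost's XGBClassifier, where the ensemble size is controlled the same way); the dataset sizes and tree counts here are arbitrary choices for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in dataset (the text uses the Otto dataset, which
# requires a separate download).
X, y = make_classification(
    n_samples=2000, n_features=20, n_informative=10, random_state=7
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=7
)

# Fit ensembles of increasing size; accuracy gains shrink as trees are added.
scores = {}
for n_trees in [10, 50, 100, 500]:
    model = GradientBoostingClassifier(n_estimators=n_trees, random_state=7)
    model.fit(X_train, y_train)
    scores[n_trees] = model.score(X_test, y_test)
    print(f"{n_trees:>4} trees: accuracy={scores[n_trees]:.3f}")
```

Typically the jump from 10 to 50 trees is large, while going from 100 to 500 changes accuracy very little, which is the diminishing-returns behavior described above.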
We can demonstrate this point of diminishing returns on the Otto dataset. The number of trees (or boosting rounds) in an XGBoost model is specified to the XGBClassifier or XGBRegressor class in the