data_mining:xgboost

===== Gradient boosting =====
  
  * $F$ is the space of functions containing all regression trees
  * $K$ is the number of trees
  * $f_k(x_i)$ is a regression tree that maps an attribute vector to a score
  
Learn functions (trees) instead of weights in $R^d$.
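
A minimal sketch of this idea, using hypothetical depth-1 trees (stumps): the ensemble prediction is the sum of the $K$ tree outputs, $\hat{y}(x) = \sum_k f_k(x)$.

<code python>
# Sketch: an additive tree ensemble. Each f_k maps an attribute
# vector x to a score; the prediction is the sum over all K trees.
def stump(feature, threshold, left_score, right_score):
    """A depth-1 regression tree (hypothetical example trees)."""
    def f(x):
        return left_score if x[feature] < threshold else right_score
    return f

trees = [stump(0, 2.5, -0.4, 0.7), stump(1, 1.0, 0.2, -0.1)]  # K = 2

def predict(x):
    return sum(f(x) for f in trees)

print(predict([3.0, 0.5]))  # 0.7 + 0.2 = 0.9
</code>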
Second-order Taylor approximation of the objective at step $t$:

$$\sum^n_{i=1} [l(y_i,\hat{y}_i^{(t-1)}) + g_if_t(x_i) + \frac{1}{2}h_if_t^2(x_i)]$$ with $g_i=\partial_{\hat{y}^{(t-1)}} l(y_i,\hat{y}^{(t-1)})$ and $h_i=\partial^2_{\hat{y}^{(t-1)}} l(y_i,\hat{y}^{(t-1)})$
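
For example, with square loss $l(y,\hat{y}) = (y-\hat{y})^2$ these derivatives are $g_i = 2(\hat{y}^{(t-1)}_i - y_i)$ and $h_i = 2$. A minimal numeric sketch (values made up for illustration):

<code python>
import numpy as np

# g_i and h_i for square loss, evaluated at the previous round's
# predictions y_hat^(t-1).
y = np.array([1.0, 0.0, 2.0])
y_hat_prev = np.array([0.8, 0.3, 1.5])

g = 2.0 * (y_hat_prev - y)    # first derivative of l w.r.t. y_hat
h = np.full_like(y, 2.0)      # second derivative (constant for square loss)
print(g)  # [-0.4  0.6 -1. ]
print(h)  # [2. 2. 2.]
</code>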
  
With constants removed (and square loss):
$$\sum^n_{i=1} [g_if_t(x_i) + \frac{1}{2}h_if_t^2(x_i)] + \Omega(f_t)$$
So the loss function enters learning only through $g_i$ and $h_i$; the rest of the procedure stays the same.
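
This is also how custom objectives work in the xgboost library: you supply a function that returns $g_i$ and $h_i$ per example, and training proceeds unchanged. A minimal sketch (synthetic data, square loss re-implemented as a custom objective):

<code python>
import numpy as np
import xgboost as xgb

def square_loss(preds, dtrain):
    """Custom objective: return g_i and h_i for each example."""
    y = dtrain.get_label()
    grad = 2.0 * (preds - y)          # g_i
    hess = np.full_like(preds, 2.0)   # h_i
    return grad, hess

# Synthetic regression data, made up for illustration.
rng = np.random.default_rng(0)
X = rng.random((100, 3))
y = 2.0 * X[:, 0] + 0.1 * rng.standard_normal(100)

dtrain = xgb.DMatrix(X, label=y)
booster = xgb.train({"max_depth": 2, "eta": 0.3}, dtrain,
                    num_boost_round=20, obj=square_loss)
print(booster.predict(dtrain)[:5])
</code>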