{"id":677,"date":"2020-08-10T03:20:00","date_gmt":"2020-08-10T03:20:00","guid":{"rendered":"https:\/\/tensorzen.online\/?p=677"},"modified":"2024-05-29T02:51:16","modified_gmt":"2024-05-29T02:51:16","slug":"xgboost","status":"publish","type":"post","link":"https:\/\/tensorzen.blog\/?p=677","title":{"rendered":"How XGBoost Controls Overfitting"},"content":{"rendered":"\n<p>Boosting models have very strong fitting capacity: left unconstrained, they can approximate the target function to arbitrary precision. The laws of physics on Earth tell us that the more powerful something is, the more easily it goes to extremes, and strong fitting capacity makes overfitting very likely. So every engineering implementation of Gradient Boosting goes out of its way to rein boosting in and keep it from running too free. Let us look at how XGBoost controls the enormous energy inside it, starting with some notation.<\/p>\n\n\n\n<p>The dataset $D \\in R^{n \\times m}$ has $n$ samples with $m$ features each. If we fit the training data with $K$ base learners, the prediction $\\hat{y}$ is:<br>$$<br>\\hat{y}_i = \\phi(x_i) = 
\\sum_{k=1}^{K}f_{k}(x_i)<br>$$<br>$f_{k}$ is a base learner, which here is a regression tree. Given a sample $x_i$, the regression tree assigns it to a leaf node. If the sample lands in the $l$-th leaf and $w_l$ denotes the score of that leaf, then<br>$$<br>f_{k}(x_i) = w_{l}<br>$$<br>So the final prediction $\\hat{y}$ is the sum of the $K$ leaf scores produced by the $K$ regression trees.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">1. Regularization<\/h2>\n\n\n\n<p>When it comes to preventing overfitting, regularization is probably the first thing that comes to mind, as naturally as a watermelon knife comes to mind when people talk about a street fight~ We iterate $K$ times to learn $K$ base learners, and in every round we add a regularization term to the loss so that the learner of that round does not fit too well. For regression trees, the simpler the structure, the better the generalization ability, so an intuitive way to prevent overfitting is to keep the tree of the current round from getting too complex. XGBoost penalizes both the number of leaf nodes and the scores they output:<\/p>\n\n\n\n<p>$$<br>L^{t} = \\sum_{i=1}^{n} l \\left ( y_i, \\hat{y}_i^{t-1} + f_{t}(x_i) \\right ) + \\Omega(f_{t})<br>$$<\/p>\n\n\n\n<p>$$<br>\\Omega(f_t) = \\gamma T + \\frac{1}{2}\\lambda \\sum_{i=1}^{T} w_{i}^2 + \\alpha \\sum_{i=1}^{T} 
|w_i|<br>$$<\/p>\n\n\n\n<p>$T$ is the number of leaf nodes of the regression tree, and $w_i$ is the score of the $i$-th leaf (every sample assigned to that leaf outputs this value). $\\gamma$, $\\lambda$ and $\\alpha$ control the strength of each term. The $\\alpha$ term is an $L_1$ penalty that XGBoost also supports; the paper does not mention it, but the code has it. The derivation below omits the $L_1$ term.<\/p>\n\n\n\n<p>Solving for $f_{t}$ in the formula above is the goal of round $t$. We could use the method introduced earlier for GBDT, but XGBoost instead solves it approximately with a second-order Taylor expansion (second-order approximation). This has a few benefits:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>The second-order approximation converges faster; this is Newton&#8217;s method.<\/li>\n\n\n\n<li>The objective is often a fairly complex non-linear function. Using a second-order approximation amounts to building, around the current prediction, a quadratic function with curvature term $\\frac{1}{2}h_i(\\hat{y}_i - 
\\hat{y}_i^{(t-1)})^2$ that stands in for the objective within a limited range. It is smoother than the original complex function and easier to solve.<\/li>\n\n\n\n<li>Compared with fitting the original objective directly, the second-order approximation naturally introduces some bias (smoothing), so it can further weaken the overfitting problem.<\/li>\n<\/ul>\n\n\n\n<p>In the objective above we have $\\hat{y}^{t} = \\hat{y}^{t-1} + f_{t}$. Expanding $l(y_i, \\hat{y}_i^{t})$ to second order gives<\/p>\n\n\n\n<p>$$l(y_i, \\hat{y}_i^{t}) = l(y_i,  \\hat{y}_i^{t-1} + f_{t}) \\simeq l(y_i, \\hat{y}_i^{t-1}) +  g_i f_{t}(x_i) + \\frac{1}{2}h_{i}f_{t}^{2}(x_i)$$<\/p>\n\n\n\n<p>where $g_i=\\partial _{\\hat{y}^{(t-1)}}l(y_i, \\hat{y}^{(t-1)}), h_i = \\partial^2 _{\\hat{y}^{(t-1)}}l(y_i, \\hat{y}^{(t-1)})$. Substituting into the objective:<\/p>\n\n\n\n<p>$$L^{t} \\simeq \\sum_{i=1}^{n} \\left [ l(y_i, \\hat{y}_i^{t-1}) +  g_i f_{t}(x_i) + \\frac{1}{2}h_{i}f_{t}^{2}(x_i) \\right ] + \\Omega (f_{t})$$<\/p>\n\n\n\n<p>Here $l(y_i, \\hat{y}_i^{t-1})$ is a constant, so the problem becomes<\/p>\n\n\n\n<p>$$L^{t} \\simeq \\sum_{i=1}^{n} \\left [ g_i f_{t}(x_i) + \\frac{1}{2}h_{i}f_{t}^{2}(x_i) \\right ] + \\Omega (f_{t})$$<\/p>\n\n\n\n<p>Here $g_i$ and $h_i$ are constants, and $f_{t}$ is a regression tree with $T$ leaf nodes, where leaf $j$ outputs the score $w_j$. We define a set that collects the samples assigned to each partition (leaf node): $I_{j} = \\{ i | q(x_i) = j 
\\}$, where $q(x_i)$ is the index of the leaf that $x_i$ falls into. In other words, $I_j$ holds all samples assigned to the $j$-th partition (leaf node). The objective can then be rewritten as<\/p>\n\n\n\n<p>$$L^{t} \\simeq \\sum_{i=1}^{n} \\left [ g_i f_{t}(x_i) + \\frac{1}{2}h_{i}f_{t}^{2}(x_i) \\right ] + \\gamma T + \\frac{1}{2}\\lambda \\sum_{j=1}^{T}w_j^2 $$<\/p>\n\n\n\n<p>The formula above sums over all samples with $\\sum_{i=1}^{n}$. As in the earlier GBDT discussion, it can be decomposed into a sum of $T$ sub-functions, one per leaf. Samples in the same leaf node output the same $w$, so the part contributed by the samples in leaf $I_j$ can be written as<\/p>\n\n\n\n<p>$$w_j \\sum_{i \\in I_{j}} g_{i} + \\frac{1}{2}\\sum_{i\\in I_{j}}h_i w_j^2 + \\frac{1}{2}\\lambda w_j^2 = w_j \\sum_{i \\in I_{j}} g_{i} + \\frac{1}{2} w_j^2 (\\sum_{i\\in I_{j}}h_i + \\lambda) $$<\/p>\n\n\n\n<p>This works because all $f_t(x)$ does is assign a sample to some region and return a score, so every sample in partition $I_j$ necessarily outputs $w_j$, i.e. $\\sum_{i \\in I_{j}} g_{i}f_{t}(x_i) = w_j \\sum_{i \\in I_{j}} g_{i} $. Over all $T$ partitions this gives<\/p>\n\n\n\n<p>$$L^{t} \\simeq \\sum_{j=1}^{T}\\left [ w_j \\sum_{i \\in I_{j}} g_{i} + \\frac{1}{2} w_j^2 (\\sum_{i\\in I_{j}}h_i + \\lambda) \\right ] + \\gamma T$$<\/p>\n\n\n\n<p>$\\gamma 
T$ is carried over from the original objective. Now fix the way $f_t$ partitions the samples and take the first derivative of the formula above with respect to $w_j$. There are $T$ different $w_j$, so for any one of them the outer $\\sum_{j=1}^{T}$ disappears. Setting the derivative to zero:<\/p>\n\n\n\n<p>$$\\sum_{i \\in I_{j}} g_i + w_j \\left (\\sum_{i \\in I_{j}}h_i + \\lambda \\right )= 0 $$<\/p>\n\n\n\n<p>This gives the optimal $w_j$:<\/p>\n\n\n\n<p>$$w_j^* = -\\frac{\\sum_{i \\in I_j} g_i}{\\sum_{i\\in I_j} h_i + \\lambda}$$<\/p>\n\n\n\n<p>To simplify the notation, let $G_j = \\sum_{i \\in I_j} g_i, H_j = \\sum_{i \\in I_j} h_i$, i.e. the sums of the gradients and Hessians of the samples assigned to that node. Substituting the optimal $w_j$ back into the objective gives the form of the optimal solution:<\/p>\n\n\n\n<p>$$\\tilde{L}^{(t)}(q) = \\sum_{j=1}^{T} \\left [- G_j \\cdot \\frac{G_j}{H_j + \\lambda} +   \\frac{1}{2} \\left ( -\\frac{G_j}{H_j + \\lambda} \\right )^2 (H_j + \\lambda)   \\right ] + \\gamma T$$<\/p>\n\n\n\n<p>Inside the brackets the first term is $-\\frac{G_j^2}{H_j + \\lambda}$ and the second is $+\\frac{1}{2}\\frac{G_j^2}{H_j + \\lambda}$, so they combine to $-\\frac{1}{2}\\frac{G_j^2}{H_j + \\lambda}$, which is the optimal solution given in the paper:<\/p>\n\n\n\n<p>$$\\tilde{L}^{(t)}(q) = - \\frac{1}{2}\\sum_{j=1}^{T} \\frac{ G_j ^2}{H_j + \\lambda} + \\gamma 
T$$<\/p>\n\n\n\n<p>What this optimal solution says is that in round $t$, as soon as we are told how the samples are partitioned (by any method at all), we can compute how far that partition is from the target, as a single number that measures the gap. We can therefore use this number to guide the construction of a binary tree: split a pile of samples into two piles, then split each of those into two again, which is simply the process of building a decision tree. Obviously this does not produce the optimal decision tree, but nothing can be done about that, since finding the optimal tree is an NP-hard problem. At the start there is only one node, denoted $I_{0}$, and the gap between the estimate and the target is<\/p>\n\n\n\n<p>$$L_{0} = - \\frac{1}{2} \\frac{G_{0}^{2}}{H_{0} + \\lambda} + \\gamma \\cdot 1 $$<\/p>\n\n\n\n<p>where $G_{0}$ and $H_{0}$ are the sums of the gradients and Hessians of all samples belonging to node 0. After splitting this node in two, the gap is<\/p>\n\n\n\n<p>$$L_{0L} + L_{0R} = - \\frac{1}{2} \\frac{ G_{0L}^2}{ H_{0L} + \\lambda}  - \\frac{1}{2} \\frac{ G_{0R}^2}{H_{0R} + \\lambda} + \\gamma \\cdot 2 $$<\/p>\n\n\n\n<p>Subtracting the gap of the two child nodes from the gap of the single node (which intuitively should be larger) gives the gain:<\/p>\n\n\n\n<p>$$\\text{Gain}(0) = L_0 - (L_{0L} + L_{0R}) = \\frac{1}{2} \\left [ \\frac{  G_{0L}^2}{ H_{0L} + \\lambda}  + \\frac{ G_{0R}^2}{H_{0R} + \\lambda} - 
\\frac{G_{0}^{2}}{H_{0} + \\lambda} \\right ] - \\gamma $$<\/p>\n\n\n\n<p>After the split, the left node contributes $\\frac{ G_{0L}^2}{ H_{0L} + \\lambda}$ and the right node contributes $\\frac{ G_{0R}^2}{H_{0R} + \\lambda}$, while the current unsplit node contributes $\\frac{G_{0}^{2}}{H_{0} + \\lambda}$. The $\\gamma$ is the penalty you pay because adding one more leaf node makes $f_t$ more complex.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">2. Shrinkage and Column Subsampling<\/h2>\n\n\n\n<p>Shrinkage was mentioned earlier in the GBDT post: after the base function of round $t$ has been learned, its output is multiplied by an extra coefficient, which further limits the contribution of that round and mitigates overfitting. Column subsampling restricts which features the base function of round $t$ may use. This technique used to be seen mostly in random forests, but nowadays everyone includes it in GBDT implementations as well.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Additive models have fairly strong fitting capacity, so controlling overfitting is especially important in GBDT. The main things XGBoost does are:<br \/>\n1. 
Adding regularization to the objective of every round so that the sub-function learned in that round does not become too strong. Because regularization makes the original objective more complex, a second-order approximation is used to solve it.<br \/>\n2. Shrinkage, which further limits the contribution of each sub-function.<br \/>\n3. Column subsampling, borrowed from random forests.<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[4,7],"tags":[27,28],"class_list":["post-677","post","type-post","status-publish","format-standard","hentry","category-machine-learning","category-papers","tag-gbdt","tag-xgboost"],"_links":{"self":[{"href":"https:\/\/tensorzen.blog\/index.php?rest_route=\/wp\/v2\/posts\/677","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/tensorzen.blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/tensorzen.blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/tensorzen.blog\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/tensorzen.blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=677"}],"version-history":[{"count":85,"href":"https:\/\/tensorzen.blog\/index.php?rest_route=\/wp\/v2\/posts\/677\/revisions"}],"predecessor-version":[{"id":780,"href":"https:\/\/tensorzen.blog\/index.php?rest_route=\/wp\/v2\/posts\/677\/revisions\/780"}],"wp:attachment":[{"href":"https:\/\/tensorzen.blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=677"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/tensorzen.blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=677"},{"tax
onomy":"post_tag","embeddable":true,"href":"https:\/\/tensorzen.blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=677"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}