{"id":48,"date":"2022-02-22T02:48:00","date_gmt":"2022-02-22T02:48:00","guid":{"rendered":"https:\/\/tensor.agenthub.uk\/?p=48"},"modified":"2024-05-17T03:18:27","modified_gmt":"2024-05-17T03:18:27","slug":"sklearn%e4%b8%adgbdt%e7%9a%84%e5%ae%9e%e7%8e%b0","status":"publish","type":"post","link":"https:\/\/tensorzen.blog\/?p=48","title":{"rendered":"GBDT\u6838\u5fc3\u6e90\u7801\u89e3\u6790"},"content":{"rendered":"\n<p>\u3010\u6587\u7ae0\u53d1\u5e03\u7684\u6bd4\u8f83\u65e9\uff0c\u65b0\u7248sklearn\u5df2\u7ecf\u4f7f\u7528Rust\u91cd\u5199\u4e86\uff0c\u53ea\u80fd\u7528\u6765\u51d1\u70ed\u95f9\u4e86\u3011<\/p>\n\n\n\n<p>sklearn\u4e2d\u5bf9GBDT\u7684\u5b9e\u73b0\u662f\u5b8c\u5168\u9075\u4ece\u8bba\u6587 Greedy Function Approximation\u7684\uff0c\u6211\u4eec\u4e00\u8d77\u6765\u770b\u4e00\u4e0b\u662f\u600e\u4e48\u5b9e\u73b0\u7684\u3002GBDT\u6e90\u7801\u6700\u6838\u5fc3\u7684\u90e8\u5206\u5e94\u8be5\u662f\u5bf9Loss Function\u7684\u5904\u7406\uff0c\u56e0\u4e3a\u9664\u53bbLoss\u90e8\u5206\u7684\u4ee3\u7801\u5176\u4ed6\u7684\u90fd\u662f\u975e\u5e38\u76f4\u89c9\u4e14\u6807\u51c6\u7684\u7a0b\u5e8f\u903b\u8f91\uff0c\u53cd\u6b63\u6211\u4eec\u5c31\u4ecesklearn\u5bf9loss\u7684\u5b9e\u73b0\u5f00\u59cb\u770b\u5427\uff5e\uff5e<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"384\" height=\"470\" src=\"https:\/\/tensor.agenthub.uk\/wp-content\/uploads\/2024\/05\/image-4.png\" alt=\"\" class=\"wp-image-590\" style=\"width:162px;height:auto\" srcset=\"https:\/\/tensorzen.blog\/wp-content\/uploads\/2024\/05\/image-4.png 384w, https:\/\/tensorzen.blog\/wp-content\/uploads\/2024\/05\/image-4-245x300.png 245w\" sizes=\"auto, (max-width: 384px) 100vw, 384px\" \/><\/figure>\n<\/div>\n\n\n<h2 class=\"wp-block-heading\">Loss Function \u7684\u5b9e\u73b0<\/h2>\n\n\n\n<p>\u4ee5\u4e8c\u5206\u7c7b\u4efb\u52a1\u4e3a\u4f8b\uff0closs\u91c7\u7528Binomial 
Deviance\uff0c\u770b\u8fd9\u4e2aloss\u5f88\u964c\u751f\uff0c\u5176\u5b9e\u8ddf\u6211\u4eec\u719f\u6089\u7684negative log-likelihood \/ cross entropy \u662f\u4e00\u56de\u4e8b\uff0c\u56e0\u4e3a\u662f\u4e8c\u5206\u7c7b\u95ee\u9898\u561b\uff0c\u6a21\u578b\u6700\u7ec8\u8f93\u51fa\u5176\u5b9e\u5c31\u662f$P(y=1|x)$\uff0c\u5373\u6837\u672c$x$\u662f\u6b63\u4f8b\u7684\u6982\u7387\uff0c\u6211\u4eec\u628a\u8fd9\u4e2a\u6982\u7387\u6807\u8bb0\u6210$p(x)$\uff0c\u90a3\u4e48Binomial Deviance\u7b49\u4e8e<\/p>\n\n\n\n<p>$$\\ell(y, F(x)) = -\\left [ y\\log(p(x)) + (1 &#8211; y)\\log(1-p(x)) \\right ]$$<\/p>\n\n\n\n<p>\u5c31\uff5e\u8ddfCross Entropy\u662f\u4e00\u4e2a\u4e1c\u897f\uff5e\u5176\u4e2d$p(x)$\u7684\u5f62\u5f0f\u4e0d\u662f\u56fa\u5b9a\u7684\uff0c\u53ea\u8981\u662f\u6982\u7387\u7684\u5f62\u5f0f\u5c31\u53ef\u4ee5\uff0csklearn\u4e2d\u628a$p(x)$\u5b9a\u4e49\u4e3asigmoid\u51fd\u6570<\/p>\n\n\n\n<p>$$p(x) = \\frac{1}{1 + e^{-F(x)}}$$<\/p>\n\n\n\n<p>\u6211\u4eec\u628a\u5b83\u5e26\u5165$\\ell(y,F(x))$\u540e\u4e00\u987f\u7b80\u5316\u5c31\u53d8\u6210\u8fd9\u6837\u4e86<\/p>\n\n\n\n<p>$$\\ell(y,F(x)) = yF(x) &#8211; \\log(1 + e^{F(x)})$$<\/p>\n\n\n\n<p>\u5316\u7b80\u7684\u65f6\u5019\uff0c\u5148\u628a$\\log$\u62c6\u5f00\u518d\u636f\u996c\u636f\u996c\u6700\u540e\u628a$p(x)$\u5e26\u5165\u4f1a\u6bd4\u8f83\u597d\u5316\u7b80\uff5e\uff5e\u5b83\u7684\u8d1f\u68af\u5ea6\u65b9\u5411<\/p>\n\n\n\n<p>$$-\\frac{\\partial \\ell(y,F(x))}{\\partial F(x)} = y + \\frac{1}{1 + e^{-F(x)}}$$<\/p>\n\n\n\n<p>\u5148\u4e0d\u7740\u6025\u5f80\u4e0b\u7ee7\u7eed\u5f80\u4e0b\u770b\u516c\u5f0f\uff0c\u6211\u4eec\u5148\u770b\u4e0bLoss\u548c\u8d1f\u68af\u5ea6\u7684\u5b9e\u73b0\uff0cBinomialDeviance Loss Function\u5b9a\u4e49\u5728src\\ensemble_gb_losses.py \u91cc\uff0c\u7ee7\u627f\u4e86GlassificationLossFunction:<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" 
style=\"font-size:.75rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#f6f6f4;--cbp-line-number-width:calc(2 * 0.6 * .75rem);line-height:1rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span style=\"display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#282A36\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"54\" height=\"14\" viewBox=\"0 0 54 14\"><g fill=\"none\" fill-rule=\"evenodd\" transform=\"translate(1 1)\"><circle cx=\"6\" cy=\"6\" r=\"6\" fill=\"#FF5F56\" stroke=\"#E0443E\" stroke-width=\".5\"><\/circle><circle cx=\"26\" cy=\"6\" r=\"6\" fill=\"#FFBD2E\" stroke=\"#DEA123\" stroke-width=\".5\"><\/circle><circle cx=\"46\" cy=\"6\" r=\"6\" fill=\"#27C93F\" stroke=\"#1AAB29\" stroke-width=\".5\"><\/circle><\/g><\/svg><\/span><span role=\"button\" tabindex=\"0\" data-code=\"class BinomialDeviance(ClassificationLossFunction):\n    &quot;&quot;&quot;Binomial deviance loss function for binary classification.\n\n    Binary classification is a special case; here, we only need to\n    fit one tree instead of ``n_classes`` trees.\n\n    Parameters\n    ----------\n    n_classes : int\n        Number of classes.\n    &quot;&quot;&quot;\" style=\"color:#f6f6f4;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 
2\"><\/path><\/svg><\/span><pre class=\"shiki dracula-soft\" style=\"background-color: #282A36\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #F286C4\">class<\/span><span style=\"color: #F6F6F4\"> <\/span><span style=\"color: #97E1F1\">BinomialDeviance<\/span><span style=\"color: #F6F6F4\">(<\/span><span style=\"color: #97E1F1; font-style: italic\">ClassificationLossFunction<\/span><span style=\"color: #F6F6F4\">):<\/span><\/span>\n<span class=\"line\"><span style=\"color: #F6F6F4\">    <\/span><span style=\"color: #7B7F8B\">&quot;&quot;&quot;Binomial deviance loss function for binary classification.<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #7B7F8B\">    Binary classification is a special case; here, we only need to<\/span><\/span>\n<span class=\"line\"><span style=\"color: #7B7F8B\">    fit one tree instead of ``n_classes`` trees.<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #7B7F8B\">    Parameters<\/span><\/span>\n<span class=\"line\"><span style=\"color: #7B7F8B\">    ----------<\/span><\/span>\n<span class=\"line\"><span style=\"color: #7B7F8B\">    n_classes : int<\/span><\/span>\n<span class=\"line\"><span style=\"color: #7B7F8B\">        Number of classes.<\/span><\/span>\n<span class=\"line\"><span style=\"color: #7B7F8B\">    &quot;&quot;&quot;<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p>Loss\u5bf9\u5e94\u7684\u4ee3\u7801\u5b9a\u4e49\u5728__call__\u65b9\u6cd5\u91cc\uff1a<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.75rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#f6f6f4;--cbp-line-number-width:calc(2 * 0.6 * .75rem);line-height:1rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span style=\"display:block;padding:16px 0 0 
16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#282A36\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"54\" height=\"14\" viewBox=\"0 0 54 14\"><g fill=\"none\" fill-rule=\"evenodd\" transform=\"translate(1 1)\"><circle cx=\"6\" cy=\"6\" r=\"6\" fill=\"#FF5F56\" stroke=\"#E0443E\" stroke-width=\".5\"><\/circle><circle cx=\"26\" cy=\"6\" r=\"6\" fill=\"#FFBD2E\" stroke=\"#DEA123\" stroke-width=\".5\"><\/circle><circle cx=\"46\" cy=\"6\" r=\"6\" fill=\"#27C93F\" stroke=\"#1AAB29\" stroke-width=\".5\"><\/circle><\/g><\/svg><\/span><span role=\"button\" tabindex=\"0\" data-code=\"def __call__(self, y, raw_predictions, sample_weight=None):\n &quot;&quot;&quot;Compute the deviance (= 2 * negative log-likelihood).\n    logaddexp(0, v) == log(1.0 + exp(v))\n &quot;&quot;&quot;\n raw_predictions = raw_predictions.ravel()\n if sample_weight is None:\n  return -2 * np.mean(\n   (y * raw_predictions) - np.logaddexp(0, raw_predictions)\n  )\n else:\n  return (\n   -2\n   \/ sample_weight.sum()\n   * np.sum(\n    sample_weight\n    * ((y * raw_predictions) - np.logaddexp(0, raw_predictions))\n   )\n  )\" style=\"color:#f6f6f4;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki dracula-soft\" style=\"background-color: #282A36\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: 
#F286C4\">def<\/span><span style=\"color: #F6F6F4\"> <\/span><span style=\"color: #BF9EEE\">__call__<\/span><span style=\"color: #F6F6F4\">(<\/span><span style=\"color: #BF9EEE; font-style: italic\">self<\/span><span style=\"color: #F6F6F4\">, <\/span><span style=\"color: #FFB86C; font-style: italic\">y<\/span><span style=\"color: #F6F6F4\">, <\/span><span style=\"color: #FFB86C; font-style: italic\">raw_predictions<\/span><span style=\"color: #F6F6F4\">, <\/span><span style=\"color: #FFB86C; font-style: italic\">sample_weight<\/span><span style=\"color: #F286C4\">=<\/span><span style=\"color: #BF9EEE\">None<\/span><span style=\"color: #F6F6F4\">):<\/span><\/span>\n<span class=\"line\"><span style=\"color: #F6F6F4\"> <\/span><span style=\"color: #7B7F8B\">&quot;&quot;&quot;Compute the deviance (= 2 * negative log-likelihood).<\/span><\/span>\n<span class=\"line\"><span style=\"color: #7B7F8B\">    logaddexp(0, v) == log(1.0 + exp(v))<\/span><\/span>\n<span class=\"line\"><span style=\"color: #7B7F8B\"> &quot;&quot;&quot;<\/span><\/span>\n<span class=\"line\"><span style=\"color: #F6F6F4\"> raw_predictions <\/span><span style=\"color: #F286C4\">=<\/span><span style=\"color: #F6F6F4\"> raw_predictions.ravel()<\/span><\/span>\n<span class=\"line\"><span style=\"color: #F6F6F4\"> <\/span><span style=\"color: #F286C4\">if<\/span><span style=\"color: #F6F6F4\"> sample_weight <\/span><span style=\"color: #F286C4\">is<\/span><span style=\"color: #F6F6F4\"> <\/span><span style=\"color: #BF9EEE\">None<\/span><span style=\"color: #F6F6F4\">:<\/span><\/span>\n<span class=\"line\"><span style=\"color: #F6F6F4\">  <\/span><span style=\"color: #F286C4\">return<\/span><span style=\"color: #F6F6F4\"> <\/span><span style=\"color: #F286C4\">-<\/span><span style=\"color: #BF9EEE\">2<\/span><span style=\"color: #F6F6F4\"> <\/span><span style=\"color: #F286C4\">*<\/span><span style=\"color: #F6F6F4\"> np.mean(<\/span><\/span>\n<span class=\"line\"><span style=\"color: #F6F6F4\">   (y 
<\/span><span style=\"color: #F286C4\">*<\/span><span style=\"color: #F6F6F4\"> raw_predictions) <\/span><span style=\"color: #F286C4\">-<\/span><span style=\"color: #F6F6F4\"> np.logaddexp(<\/span><span style=\"color: #BF9EEE\">0<\/span><span style=\"color: #F6F6F4\">, raw_predictions)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #F6F6F4\">  )<\/span><\/span>\n<span class=\"line\"><span style=\"color: #F6F6F4\"> <\/span><span style=\"color: #F286C4\">else<\/span><span style=\"color: #F6F6F4\">:<\/span><\/span>\n<span class=\"line\"><span style=\"color: #F6F6F4\">  <\/span><span style=\"color: #F286C4\">return<\/span><span style=\"color: #F6F6F4\"> (<\/span><\/span>\n<span class=\"line\"><span style=\"color: #F6F6F4\">   <\/span><span style=\"color: #F286C4\">-<\/span><span style=\"color: #BF9EEE\">2<\/span><\/span>\n<span class=\"line\"><span style=\"color: #F6F6F4\">   <\/span><span style=\"color: #F286C4\">\/<\/span><span style=\"color: #F6F6F4\"> sample_weight.sum()<\/span><\/span>\n<span class=\"line\"><span style=\"color: #F6F6F4\">   <\/span><span style=\"color: #F286C4\">*<\/span><span style=\"color: #F6F6F4\"> np.sum(<\/span><\/span>\n<span class=\"line\"><span style=\"color: #F6F6F4\">    sample_weight<\/span><\/span>\n<span class=\"line\"><span style=\"color: #F6F6F4\">    <\/span><span style=\"color: #F286C4\">*<\/span><span style=\"color: #F6F6F4\"> ((y <\/span><span style=\"color: #F286C4\">*<\/span><span style=\"color: #F6F6F4\"> raw_predictions) <\/span><span style=\"color: #F286C4\">-<\/span><span style=\"color: #F6F6F4\"> np.logaddexp(<\/span><span style=\"color: #BF9EEE\">0<\/span><span style=\"color: #F6F6F4\">, raw_predictions))<\/span><\/span>\n<span class=\"line\"><span style=\"color: #F6F6F4\">   )<\/span><\/span>\n<span class=\"line\"><span style=\"color: #F6F6F4\">  
)<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p>Note the multiplication by 2 here: it is just the deviance convention (deviance = 2 times the negative log-likelihood) and rescales the loss without changing its minimizer. Correspondingly, the gradient code leaves that factor out and computes half of the negative gradient, as its docstring says:<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.75rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#f6f6f4;--cbp-line-number-width:calc(1 * 0.6 * .75rem);line-height:1rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span style=\"display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#282A36\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"54\" height=\"14\" viewBox=\"0 0 54 14\"><g fill=\"none\" fill-rule=\"evenodd\" transform=\"translate(1 1)\"><circle cx=\"6\" cy=\"6\" r=\"6\" fill=\"#FF5F56\" stroke=\"#E0443E\" stroke-width=\".5\"><\/circle><circle cx=\"26\" cy=\"6\" r=\"6\" fill=\"#FFBD2E\" stroke=\"#DEA123\" stroke-width=\".5\"><\/circle><circle cx=\"46\" cy=\"6\" r=\"6\" fill=\"#27C93F\" stroke=\"#1AAB29\" stroke-width=\".5\"><\/circle><\/g><\/svg><\/span><span role=\"button\" tabindex=\"0\" data-code=\"def negative_gradient(self, y, raw_predictions, **kargs):\n    &quot;&quot;&quot;Compute half of the negative gradient.\n    &quot;&quot;&quot;\n    return y - expit(raw_predictions.ravel())\" style=\"color:#f6f6f4;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 
2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki dracula-soft\" style=\"background-color: #282A36\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #F286C4\">def<\/span><span style=\"color: #F6F6F4\"> <\/span><span style=\"color: #62E884\">negative_gradient<\/span><span style=\"color: #F6F6F4\">(<\/span><span style=\"color: #BF9EEE; font-style: italic\">self<\/span><span style=\"color: #F6F6F4\">, <\/span><span style=\"color: #FFB86C; font-style: italic\">y<\/span><span style=\"color: #F6F6F4\">, <\/span><span style=\"color: #FFB86C; font-style: italic\">raw_predictions<\/span><span style=\"color: #F6F6F4\">, <\/span><span style=\"color: #F286C4\">**<\/span><span style=\"color: #FFB86C; font-style: italic\">kargs<\/span><span style=\"color: #F6F6F4\">):<\/span><\/span>\n<span class=\"line\"><span style=\"color: #F6F6F4\">    <\/span><span style=\"color: #7B7F8B\">&quot;&quot;&quot;Compute half of the negative gradient.<\/span><\/span>\n<span class=\"line\"><span style=\"color: #7B7F8B\">    &quot;&quot;&quot;<\/span><\/span>\n<span class=\"line\"><span style=\"color: #F6F6F4\">    <\/span><span style=\"color: #F286C4\">return<\/span><span style=\"color: #F6F6F4\"> y <\/span><span style=\"color: #F286C4\">-<\/span><span style=\"color: #F6F6F4\"> expit(raw_predictions.ravel())<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p>The expit function comes from scipy; it computes the sigmoid.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Updating the Leaf Nodes<\/h2>\n\n\n\n<p>As discussed in the previous post <a href=\"https:\/\/tensor.agenthub.uk\/?p=280\" data-type=\"post\" data-id=\"280\">GBDT Fundamentals<\/a>, within one boosting iteration the regression tree partitions the samples into $J$ disjoint regions according to the negative-gradient direction, and each of the $J$ regions is then optimized separately<\/p>\n\n\n\n<p>$$\\gamma_{jm} = \\arg \\min_{\\gamma} \\sum_{x_i \\in R_{jm}} -\\left [ y_i (F_{m-1}(x_i) + \\gamma) - \\log ( 1 + e^{F_{m-1}(x_i) + \\gamma}) \\right ]$$<\/p>\n\n\n\n<p>$F_{m-1}(x_i)$ is called raw_predictions in the code, and it is already known within the current iteration. Faced with a minimization, the first instinct is to set the first derivative to zero; here, though, the resulting equation is transcendental in $\\gamma$ and has no closed-form solution~<\/p>\n\n\n\n<p><strong>Which is a bit of a pity~~<\/strong><\/p>\n\n\n\n<p>So instead of solving exactly, sklearn approximates the optimum with a second-order Taylor expansion; the paper describes this as taking a single step in the Newton direction, which amounts to the same thing. As you may guess, I don't take such things on faith, so I ran a few experiments: with the same 5 boosting rounds, updating the leaves with the Newton step made the loss decay faster and gave a slightly better AUC~~ If Newton's method is unfamiliar, see my earlier post on Newton's method; after reading it you will see why, expanding around $\\gamma = 0$ and summing over the samples in the leaf, $\\gamma$ equals<\/p>\n\n\n\n<p>$$\\gamma = \\frac{g(\\gamma)}{h(\\gamma)}$$<\/p>\n\n\n\n<p>$g(\\gamma)$ and $h(\\gamma)$ are the negative gradient and the Hessian, i.e. the first- and second-order derivatives<\/p>\n\n\n\n<p>$$g(\\gamma) = 
-\\frac{\\partial \\ell(y, \\gamma)}{\\partial \\gamma} = y - \\text{sigmoid}(F_{m-1}(x_i) + \\gamma)$$<\/p>\n\n\n\n<p>$$h(\\gamma) = \\frac{\\partial^{2} \\ell(y, \\gamma)}{\\partial \\gamma ^2} = (1 - \\text{sigmoid}(F_{m-1}(x_i) + \\gamma)) \\, \\text{sigmoid}(F_{m-1}(x_i) + \\gamma)$$<\/p>\n\n\n\n<p>The code is implemented in the _update_terminal_region method:<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.75rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#f6f6f4;--cbp-line-number-width:calc(2 * 0.6 * .75rem);line-height:1rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span style=\"display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#282A36\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"54\" height=\"14\" viewBox=\"0 0 54 14\"><g fill=\"none\" fill-rule=\"evenodd\" transform=\"translate(1 1)\"><circle cx=\"6\" cy=\"6\" r=\"6\" fill=\"#FF5F56\" stroke=\"#E0443E\" stroke-width=\".5\"><\/circle><circle cx=\"26\" cy=\"6\" r=\"6\" fill=\"#FFBD2E\" stroke=\"#DEA123\" stroke-width=\".5\"><\/circle><circle cx=\"46\" cy=\"6\" r=\"6\" fill=\"#27C93F\" stroke=\"#1AAB29\" stroke-width=\".5\"><\/circle><\/g><\/svg><\/span><span role=\"button\" tabindex=\"0\" data-code=\"\ndef _update_terminal_region(self,tree,terminal_regions,leaf,X,y,residual,raw_predictions,sample_weight,):\n    &quot;&quot;&quot;Make a single Newton-Raphson step.\n\n    our node estimate is given by:\n\n        sum(w * (y - prob)) \/ sum(w * prob * (1 - prob))\n\n    we take advantage that: y - prob = residual\n    &quot;&quot;&quot;\" style=\"color:#f6f6f4;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 
0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki dracula-soft\" style=\"background-color: #282A36\" tabindex=\"0\"><code><span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #F286C4\">def<\/span><span style=\"color: #F6F6F4\"> <\/span><span style=\"color: #62E884\">_update_terminal_region<\/span><span style=\"color: #F6F6F4\">(<\/span><span style=\"color: #BF9EEE; font-style: italic\">self<\/span><span style=\"color: #F6F6F4\">,<\/span><span style=\"color: #FFB86C; font-style: italic\">tree<\/span><span style=\"color: #F6F6F4\">,<\/span><span style=\"color: #FFB86C; font-style: italic\">terminal_regions<\/span><span style=\"color: #F6F6F4\">,<\/span><span style=\"color: #FFB86C; font-style: italic\">leaf<\/span><span style=\"color: #F6F6F4\">,<\/span><span style=\"color: #FFB86C; font-style: italic\">X<\/span><span style=\"color: #F6F6F4\">,<\/span><span style=\"color: #FFB86C; font-style: italic\">y<\/span><span style=\"color: #F6F6F4\">,<\/span><span style=\"color: #FFB86C; font-style: italic\">residual<\/span><span style=\"color: #F6F6F4\">,<\/span><span style=\"color: #FFB86C; font-style: italic\">raw_predictions<\/span><span style=\"color: #F6F6F4\">,<\/span><span style=\"color: #FFB86C; font-style: italic\">sample_weight<\/span><span style=\"color: #F6F6F4\">,):<\/span><\/span>\n<span class=\"line\"><span style=\"color: #F6F6F4\">    <\/span><span style=\"color: #7B7F8B\">&quot;&quot;&quot;Make a single Newton-Raphson step.<\/span><\/span>\n<span 
class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #7B7F8B\">    our node estimate is given by:<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #7B7F8B\">        sum(w * (y - prob)) \/ sum(w * prob * (1 - prob))<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #7B7F8B\">    we take advantage that: y - prob = residual<\/span><\/span>\n<span class=\"line\"><span style=\"color: #7B7F8B\">    &quot;&quot;&quot;<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p>\u597d\u4e86\uff0c\u6838\u5fc3\u7684Loss\u90e8\u5206\u6211\u4eec\u5c31\u770b\u5b8c\u4e86\uff0c\u7528\u4e8eGBDT\u7684Loss Function\u5927\u6982\u7ed3\u6784\u90fd\u5dee\u4e0d\u591a\u7684\u3002\u9664\u53bbloss\u90e8\u5206\uff0c\u5176\u4ed6\u4ee3\u7801\u90fd\u662f\u6bd4\u8f83\u76f4\u89c9\u7684\uff0c\u5c31\u4e0d\u4e00\u4e00\u7ec6\u8bf4\u4e86\uff0c\u5927\u5bb6\u90fd\u662f\u6210\u5e74\u4eba\u4e86\uff0c\u5e94\u8be5\u770b\u5f97\u61c2\uff5e\uff5e \u6211\u628a\u8bad\u7ec3\u548c\u9884\u6d4b\u9636\u6bb5\u7684\u4e3b\u8981\u6d41\u7a0b\u56fe\u7ed9\u753b\u51fa\u6765\u8d34\u5728\u4e0b\u9762\uff0c\u60a8\u6709\u5174\u8da3\u5c31\u6162\u6162\u54c1\u5427\uff5e\u5927\u6982\u5982\u6b64<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">\u8bad\u7ec3\u9636\u6bb5\u7684\u6e90\u7801\u5b9e\u73b0<\/h2>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"828\" src=\"https:\/\/tensor.agenthub.uk\/wp-content\/uploads\/2024\/05\/image-5-1024x828.png\" alt=\"\" class=\"wp-image-599\" srcset=\"https:\/\/tensorzen.blog\/wp-content\/uploads\/2024\/05\/image-5-1024x828.png 1024w, https:\/\/tensorzen.blog\/wp-content\/uploads\/2024\/05\/image-5-300x243.png 300w, https:\/\/tensorzen.blog\/wp-content\/uploads\/2024\/05\/image-5-768x621.png 768w, https:\/\/tensorzen.blog\/wp-content\/uploads\/2024\/05\/image-5.png 1080w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<h2 
class=\"wp-block-heading\"><strong>\u9884\u6d4b\u9636\u6bb5\u7684\u6e90\u7801\u5b9e\u73b0<\/strong><\/h2>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"472\" height=\"790\" src=\"https:\/\/tensor.agenthub.uk\/wp-content\/uploads\/2024\/05\/image-6.png\" alt=\"\" class=\"wp-image-600\" srcset=\"https:\/\/tensorzen.blog\/wp-content\/uploads\/2024\/05\/image-6.png 472w, https:\/\/tensorzen.blog\/wp-content\/uploads\/2024\/05\/image-6-179x300.png 179w\" sizes=\"auto, (max-width: 472px) 100vw, 472px\" \/><\/figure>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"480\" height=\"443\" src=\"https:\/\/tensor.agenthub.uk\/wp-content\/uploads\/2024\/05\/image-7.png\" alt=\"\" class=\"wp-image-601\" srcset=\"https:\/\/tensorzen.blog\/wp-content\/uploads\/2024\/05\/image-7.png 480w, https:\/\/tensorzen.blog\/wp-content\/uploads\/2024\/05\/image-7-300x277.png 300w\" sizes=\"auto, (max-width: 480px) 100vw, 480px\" \/><\/figure>\n\n\n\n<p>\u53c2\u8003\u6587\u732e\uff1a<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li class=\"has-small-font-size\">Friedman J H . Greedy Function Approximation: A Gradient Boosting Machine[J]. Annals of Statistics, 2001, 29(5):1189-1232.<\/li>\n\n\n\n<li class=\"has-small-font-size\">Jerome F , Robert T , Trevor H . Additive logistic regression: a statistical view of boosting (With discussion and a rejoinder by the authors)[J]. Ann. Statist. 
2000.(FHT00)<\/li>\n\n\n\n<li class=\"has-small-font-size\">sklearn\u5b98\u65b9\u6587\u6863<\/li>\n\n\n\n<li class=\"has-small-font-size\">sklearn \u6e90\u7801<\/li>\n<\/ol>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>\u3010\u6587\u7ae0\u53d1\u5e03\u7684\u6bd4\u8f83\u65e9\uff0c\u65b0\u7248sklearn\u5df2\u7ecf\u4f7f\u7528Rust\u91cd\u5199\u4e86\uff0c\u53ea\u80fd\u7528\u6765\u51d1\u70ed\u95f9\u4e86\u3011 sklearn\u4e2d\u5bf9GBDT\u7684\u5b9e\u73b0\u662f\u5b8c\u5168\u9075\u4ece\u8bba\u6587 Greedy Function Approximation\u7684\uff0c\u6211\u4eec\u4e00\u8d77\u6765\u770b\u4e00\u4e0b\u662f\u600e\u4e48\u5b9e\u73b0\u7684\u3002GBDT\u6e90\u7801\u6700\u6838\u5fc3\u7684\u90e8\u5206\u5e94\u8be5\u662f\u5bf9Loss Function\u7684\u5904\u7406\uff0c\u56e0\u4e3a\u9664\u53bbLoss\u90e8\u5206\u7684\u4ee3\u7801\u5176\u4ed6\u7684\u90fd\u662f\u975e\u5e38\u76f4\u89c9\u4e14\u6807\u51c6\u7684\u7a0b\u5e8f\u903b\u8f91\uff0c\u53cd\u6b63\u6211\u4eec\u5c31\u4ecesklearn\u5bf9loss\u7684\u5b9e\u73b0\u5f00\u59cb\u770b\u5427\uff5e\uff5e Loss Function \u7684\u5b9e\u73b0 \u4ee5\u4e8c\u5206\u7c7b\u4efb\u52a1\u4e3a\u4f8b\uff0closs\u91c7\u7528Binomial Deviance\uff0c\u770b\u8fd9\u4e2aloss\u5f88\u964c\u751f\uff0c\u5176\u5b9e\u8ddf\u6211\u4eec\u719f\u6089\u7684negative log-likelihood \/ cross entropy \u662f\u4e00\u56de\u4e8b\uff0c\u56e0\u4e3a\u662f\u4e8c\u5206\u7c7b\u95ee\u9898\u561b\uff0c\u6a21\u578b\u6700\u7ec8\u8f93\u51fa\u5176\u5b9e\u5c31\u662f$P(y=1|x)$\uff0c\u5373\u6837\u672c$x$\u662f\u6b63\u4f8b\u7684\u6982\u7387\uff0c\u6211\u4eec\u628a\u8fd9\u4e2a\u6982\u7387\u6807\u8bb0\u6210$p(x)$\uff0c\u90a3\u4e48Binomial Deviance\u7b49\u4e8e $$\\ell(y, F(x)) = -\\left [ y\\log(p(x)) + (1 &#8211; y)\\log(1-p(x)) \\right 
[&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[5,4],"tags":[27,31,32],"class_list":["post-48","post","type-post","status-publish","format-standard","hentry","category-coding","category-machine-learning","tag-gbdt","tag-lightgbm","tag-32"],"_links":{"self":[{"href":"https:\/\/tensorzen.blog\/index.php?rest_route=\/wp\/v2\/posts\/48","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/tensorzen.blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/tensorzen.blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/tensorzen.blog\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/tensorzen.blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=48"}],"version-history":[{"count":10,"href":"https:\/\/tensorzen.blog\/index.php?rest_route=\/wp\/v2\/posts\/48\/revisions"}],"predecessor-version":[{"id":602,"href":"https:\/\/tensorzen.blog\/index.php?rest_route=\/wp\/v2\/posts\/48\/revisions\/602"}],"wp:attachment":[{"href":"https:\/\/tensorzen.blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=48"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/tensorzen.blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=48"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/tensorzen.blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=48"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}