In the updated Gymnasium environment interface, the distinction between “terminated” and “truncated” makes it clearer why an episode ended, which is useful for more nuanced reinforcement learning […]
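To make the split concrete, here is a minimal rollout sketch against the Gymnasium API (the CartPole environment and the random policy are just illustrative choices):

```python
import gymnasium as gym

env = gym.make("CartPole-v1")
obs, info = env.reset(seed=42)

done = False
while not done:
    action = env.action_space.sample()  # random policy, for illustration
    obs, reward, terminated, truncated, info = env.step(action)
    # terminated: the MDP itself ended (e.g. the pole fell over)
    # truncated: the episode was cut short externally (e.g. a time limit),
    # so bootstrapping a value estimate from `obs` can still make sense
    done = terminated or truncated
env.close()
```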
Max Heap Sort
A max-heap viewed as (a) a binary tree and (b) an array. The root of the tree is A[1], and given the index i of a node, there’s […]
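The index arithmetic the excerpt alludes to is easy to state in code; a small sketch (0-based Python lists, so CLRS's 1-based formulas shift by one):

```python
def parent(i):  # parent of node i (0-based indexing)
    return (i - 1) // 2

def left(i):    # left child of node i
    return 2 * i + 1

def right(i):   # right child of node i
    return 2 * i + 2

# Max-heap property: every non-root node is <= its parent.
A = [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]  # the classic CLRS example heap
assert all(A[parent(i)] >= A[i] for i in range(1, len(A)))
```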
PRML Chapter 1
1.1 Example: Polynomial Curve Fitting
Now suppose that we are given a training set comprising $N$ observations of $x$, written $\textbf{x} = (x_1, \dots, x_N)^{T}$, together with corresponding […]
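For reference, the setup this excerpt leads into fits an order-$M$ polynomial $y(x, \textbf{w})$ to targets $\textbf{t} = (t_1, \dots, t_N)^{T}$ by minimizing a sum-of-squares error:

$$y(x, \textbf{w}) = \sum_{j=0}^{M} w_j x^j, \qquad E(\textbf{w}) = \frac{1}{2}\sum_{n=1}^{N} \bigl\{ y(x_n, \textbf{w}) - t_n \bigr\}^2$$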
Mathematical notation
Vectors are denoted by lower case bold Roman letters such as $\textbf{x}$, and all vectors are assumed to be column vectors. A superscript $T$ denotes the transpose of […]
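In other words, writing the components out:

$$\textbf{x} = (x_1, \dots, x_N)^{T} = \begin{pmatrix} x_1 \\ \vdots \\ x_N \end{pmatrix}$$

so $(x_1, \dots, x_N)$ is a row vector, and its transpose is the column vector the notation assumes.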
GBDT Core Source Code Walkthrough
[This post was published quite early; newer versions of sklearn have since rewritten this in Rust, so it can only be read for fun now.] sklearn's GBDT implementation follows the paper Greedy Function Approximation to the letter, so let's look at how it is done. The real core of the GBDT source code is its handling of the loss function: set the loss aside, and the rest of the code is intuitive, standard program logic. So we will start from sklearn's implementation of the loss.
Implementing the Loss Function
Take binary classification as an example, where the loss is the Binomial Deviance. The name may look unfamiliar, but it is the same thing as the negative log-likelihood / cross entropy we already know. Since this is binary classification, the model's final output is just $P(y=1|x)$, the probability that sample $x$ is positive. Writing this probability as $p(x)$, the Binomial Deviance equals $$\ell(y, F(x)) = -\left[ y\log(p(x)) + (1 - y)\log(1 - p(x)) \right […]
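A quick numerical sanity check of that equivalence (a NumPy sketch, not sklearn's actual code; the sigmoid link $p(x) = 1/(1+e^{-F(x)})$ from the raw score to the probability is the usual assumption):

```python
import numpy as np

# With p(x) = sigmoid(F(x)), the binomial deviance written in terms of
# the raw score F simplifies to log(1 + exp(F)) - y * F, which is
# exactly the cross entropy / negative log-likelihood.
def deviance_from_raw(y, f):
    return np.logaddexp(0.0, f) - y * f  # numerically stable form

def cross_entropy_from_prob(y, p):
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

y = np.array([1.0, 0.0, 1.0, 0.0])
f = np.array([2.0, -1.0, 0.5, -3.0])   # raw model scores F(x)
p = 1.0 / (1.0 + np.exp(-f))           # p(x) = sigmoid(F(x))
assert np.allclose(deviance_from_raw(y, f), cross_entropy_from_prob(y, p))
```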
How Multi-Head Attention Is Computed
An intuitive walkthrough of how Attention and Multi-Head Attention are computed, followed by a NumPy implementation.
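As a preview of where the post ends up, here is a minimal single-head scaled dot-product attention in NumPy (shapes and variable names are illustrative):

```python
import numpy as np

def softmax(x, axis=-1):
    # subtract the max for numerical stability
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Q: (n_q, d_k), K: (n_k, d_k), V: (n_k, d_v)
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # similarity of queries to keys
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ V                  # weighted sum of values

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))
K = rng.normal(size=(6, 8))
V = rng.normal(size=(6, 8))
print(scaled_dot_product_attention(Q, K, V).shape)  # (4, 8)
```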
Custom Objective Functions in XGBoost
xgboost ships with a rich enough set of built-in objective functions to cover everyday needs, but if you happen to have special requirements, it also lets you define a custom objective function (also called a loss function). This post shows how.
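A minimal sketch of the plumbing: a custom objective hands xgboost the first and second derivatives (gradient and hessian) of your loss with respect to the raw prediction. Here plain squared error is re-implemented on hypothetical toy data, just for illustration:

```python
import numpy as np
import xgboost as xgb

def squared_error_obj(preds, dtrain):
    labels = dtrain.get_label()
    grad = preds - labels       # dL/df for L = 0.5 * (f - y)^2
    hess = np.ones_like(preds)  # d^2L/df^2
    return grad, hess

# Hypothetical toy regression data, just to make the sketch runnable.
rng = np.random.default_rng(0)
X = rng.random((100, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.1 * rng.normal(size=100)

dtrain = xgb.DMatrix(X, label=y)
booster = xgb.train({"max_depth": 3}, dtrain, num_boost_round=20,
                    obj=squared_error_obj)
```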
This Is Probably the Original Idea Behind GBDT
GBDT stands for Gradient Boosting Decision Tree. The word "Gradient" naturally brings gradient descent to mind, so that is where we will start.
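As a preview of where that analogy lands (the standard function-space view, with $\rho$ a step size and $\ell$ the per-sample loss): gradient descent updates parameters, while gradient boosting applies the same update to the model's prediction $F(x)$, fitting each new tree to the negative gradient:

$$\theta_t = \theta_{t-1} - \rho\,\nabla_{\theta} L(\theta_{t-1}) \qquad\longleftrightarrow\qquad F_m(x) = F_{m-1}(x) - \rho_m \left[\frac{\partial \ell(y, F(x))}{\partial F(x)}\right]_{F(x) = F_{m-1}(x)}$$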