In Stable-Baselines3, when using vectorized environments like ‘SubprocVecEnv’ for parallel environment management, the mean reward isn’t displayed by default during the training phase. This is because ‘SubprocVecEnv’ runs […]
The distinction between “terminated” and “truncated” in RL
In the updated Gymnasium environment interface, the distinction between “terminated” and “truncated” provides more clarity on why an episode ended, which is useful for more nuanced reinforcement learning […]
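To make the distinction concrete, here is a minimal sketch of how the two flags returned by Gymnasium's `env.step` are commonly handled when building a one-step TD target. The helper name `td_target` and its parameters are hypothetical, chosen for illustration only: the idea is that a terminated episode has no future return to bootstrap from, while a truncated one (for example, a time limit) still does.

```python
def td_target(reward, next_value, terminated, truncated, gamma=0.99):
    """Return (one-step TD target, whether the environment needs a reset).

    Hypothetical helper for illustration; not part of Gymnasium.
    """
    # terminated: a true terminal state was reached, so the future return is 0.
    # truncated: the episode was cut off from outside (e.g. a time limit),
    #            so the next state still has value and we keep bootstrapping.
    bootstrap = 0.0 if terminated else next_value
    target = reward + gamma * bootstrap
    # Either flag ends the episode from the training loop's point of view.
    needs_reset = terminated or truncated
    return target, needs_reset
```

Both flags trigger a reset, but only `terminated` zeroes out the bootstrap term.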
Max Heap Sort
A max-heap viewed as (a) a binary tree and (b) an array. The root of the tree is A[1], and given the index i of a node, there’s […]
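As a rough sketch of the array view, the code below uses the standard 1-indexed heap convention (parent at `i // 2`, children at `2 * i` and `2 * i + 1`), which is an assumption here since the excerpt is cut off before stating it. The function names are illustrative; index 0 of the list is padded so that the root sits at `a[1]`.

```python
def parent(i):
    return i // 2

def left(i):
    return 2 * i

def right(i):
    return 2 * i + 1

def max_heapify(a, i, heap_size):
    """Sift a[i] down until the subtree rooted at i is a max-heap (a[0] is unused)."""
    while True:
        l, r = left(i), right(i)
        largest = i
        if l <= heap_size and a[l] > a[largest]:
            largest = l
        if r <= heap_size and a[r] > a[largest]:
            largest = r
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest

def heap_sort(values):
    """Return a new list sorted in ascending order using a 1-indexed max-heap."""
    a = [None] + list(values)  # pad index 0 so the root is a[1]
    n = len(values)
    # Build the heap bottom-up, starting from the last node that has a child.
    for i in range(n // 2, 0, -1):
        max_heapify(a, i, n)
    # Repeatedly move the current maximum to the end and shrink the heap.
    for end in range(n, 1, -1):
        a[1], a[end] = a[end], a[1]
        max_heapify(a, 1, end - 1)
    return a[1:]
```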
PRML Chapter 1
1.1 Example: Polynomial Curve Fitting Now suppose that we are given a training set comprising $N$ observations of $x$, written $\textbf{x} = (x_1, \ldots, x_N)^{T}$, together with corresponding […]
Mathematical notation
Vectors are denoted by lower case bold Roman letters such as $\textbf{x}$, and all vectors are assumed to be column vectors. A superscript $T$ denotes the transpose of […]
A Walkthrough of the Core GBDT Source Code
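[This post was published quite a while ago, and newer versions of sklearn have since rewritten this code, so read it just for fun.] sklearn's implementation of GBDT follows the paper Greedy Function Approximation exactly, so let's look at how it is done. The most important part of the GBDT source is arguably the handling of the loss function, because apart from the loss code everything else is intuitive, standard program logic. So let's start with how sklearn implements the loss. Implementing the Loss Function: Take binary classification as an example, where the loss is Binomial Deviance. The name may look unfamiliar, but it is the same thing as the familiar negative log-likelihood / cross entropy. Since this is a binary classification problem, the model's final output is $P(y=1|x)$, the probability that sample $x$ is a positive example. Denoting this probability by $p(x)$, the Binomial Deviance is $$\ell(y, F(x)) = -\left [ y\log(p(x)) + (1 - y)\log(1-p(x)) \right […]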
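As a small numeric illustration of this loss, here is a sketch that assumes $p(x)$ is the sigmoid of the raw score $F(x)$ (the excerpt is cut off before saying so). Under that assumption the loss simplifies to $\log(1 + e^{F}) - yF$, and its negative gradient with respect to $F$ is $y - p(x)$. The function names are illustrative and are not taken from sklearn.

```python
import math

def sigmoid(f):
    """Map a raw score F(x) to p(x), evaluated in a numerically stable way."""
    if f >= 0:
        return 1.0 / (1.0 + math.exp(-f))
    e = math.exp(f)
    return e / (1.0 + e)

def binomial_deviance(y, f):
    """-[y*log(p) + (1-y)*log(1-p)] with p = sigmoid(f), rewritten in terms of f."""
    # log(1 + exp(f)), computed without overflow for large |f|.
    softplus = max(f, 0.0) + math.log1p(math.exp(-abs(f)))
    return softplus - y * f

def negative_gradient(y, f):
    """Negative gradient of the loss w.r.t. f: the residual y - p(x)."""
    return y - sigmoid(f)
```

The residual `y - p(x)` is the quantity each new tree would be fitted to in the gradient boosting setup.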
How Multi-Head Attention Is Computed
An intuitive look at how Attention and Multi-Head Attention are computed, followed by an implementation in NumPy.
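Custom Objective Functions in XGBoost
xgboost ships with a rich set of built-in objective functions, which is normally enough for everyday needs. If, just in case, you have a special requirement, it also lets you define your own objective function, also called a loss function. This post shows how to do that.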
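For a rough idea of what a custom objective has to provide: xgboost expects the first and second derivatives of the loss with respect to the raw prediction, one pair per sample. The sketch below computes them for the logistic loss using plain Python lists so that it runs without the library. The helper name `logistic_grad_hess` and its list-based signature are hypothetical; in xgboost itself, the callable passed as `obj` to `xgboost.train` receives `(preds, dtrain)` and reads the labels with `dtrain.get_label()`.

```python
import math

def _sigmoid(f):
    # Numerically stable logistic function.
    if f >= 0:
        return 1.0 / (1.0 + math.exp(-f))
    e = math.exp(f)
    return e / (1.0 + e)

def logistic_grad_hess(preds, labels):
    """Per-sample gradient and hessian of the logistic loss w.r.t. the raw score.

    Hypothetical stand-in for an xgboost objective: preds are raw margins,
    labels are 0/1 targets.
    """
    grad, hess = [], []
    for f, y in zip(preds, labels):
        p = _sigmoid(f)
        grad.append(p - y)          # first derivative of the loss
        hess.append(p * (1.0 - p))  # second derivative of the loss
    return grad, hess
```

Wrapping the same arithmetic in a function of `(preds, dtrain)` that returns NumPy arrays would be the usual way to hand it to xgboost.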